Reduced complexity correlation filters

ABSTRACT

A methodology is described to reduce the complexity of filters for face recognition by reducing the memory requirement to, for example, 2 bits/pixel in the frequency domain. Reduced-complexity correlations are achieved by having quantized MACE, UMACE, OTSDF, UOTSDF, MACH, and other filters, in conjunction with a quantized Fourier transform of the input image. This reduces complexity in comparison to the advanced correlation filters using full-phase correlation. However, the verification performance of the reduced complexity filters is comparable to that of full-complexity filters. A special case of using 4-phases to represent both the filter and training/test images in the Fourier domain leads to further reductions in the computational formulations. This also enables the storage and synthesis of filters in limited-memory and limited-computational power platforms such as PDAs, cell phones, etc. An online training algorithm implemented on a face verification system is described for synthesizing correlation filters to handle pose/scale variations. A way to perform efficient face localization is also discussed. Because of the rules governing abstracts, this abstract should not be used to construe the claims.

REFERENCE TO RELATED APPLICATION

The disclosure in the present application claims priority benefits of the earlier filed U.S. provisional patent application Ser. No. 60/474,019, titled “Reduced Complexity Correlation Filters,” filed on May 29, 2003, the entirety of which is hereby incorporated by reference.

BACKGROUND

1. Field of the Disclosure

The present disclosure generally relates to image pattern recognition for use in various applications including automatic target recognition, face recognition, fingerprint recognition, and iris recognition, among others and, more particularly, to image pattern recognition using correlation filters.

2. Brief Description of Related Art

Ever since the advent of the optical frequency plane correlator or correlation filter (A. VanderLugt, IEEE Trans. Inf. Th., 10 (1964) 139), there has been considerable interest in using correlators for pattern recognition. Correlators are shift-invariant filters (i.e., no need to center the input face image during testing), which allow one to locate the object of interest in the input scene merely by locating the correlation peak. Thus, one may not need to segment or register an object in the input scene prior to correlation, as is required in many other image pattern recognition methods. Much of the earlier work in correlation-based pattern recognition was devoted to recognizing military vehicles in scenes. Correlators (or more correctly, correlation filters) can be used for recognition of other patterns such as fingerprints, face images, etc. Authenticating the identity of a user based on their biometrics (e.g., face, iris, fingerprint, voice, etc.) is a growing research topic with a wide range of applications in e-commerce, computer security and consumer electronics. In authentication (also termed “verification”), a stored biometric is compared to a live biometric to determine if the live biometric is that of an authorized user or not. There is a wide range of computing platforms that can be used to host biometric authentication systems. With current desktop computing power, researchers may not have to worry about the complexity of the algorithms; however, embedding such verification modules in small form factor devices such as cell phones and PDAs (Personal Digital Assistants) is a challenge as these platforms are limited by their memory and computing power. In applications where these filters are stored directly on a chip (such as in system-on-chip implementations), the memory available may be limited. Therefore, it is desirable to devise correlation filters with reduced memory requirements.

The matched spatial filter (MSF) (D. O. North, Proc. IEEE, 51 (1963) 1016) is based on a single view of the target and is optimal (in the sense of yielding maximal signal-to-noise ratio (SNR)) for detecting a completely known pattern in the presence of additive, white noise (noise with equal power at all frequencies). Unfortunately, MSFs are not suitable for practical pattern recognition because their correlation peak degrades rapidly when the input patterns deviate (sometimes even slightly) from the reference. These variations in the patterns are often due to common phenomena such as pose, illumination and scale changes. In optical implementations, the MSF is represented by a transparency, thus the transmittance of the filter is less than or equal to 1 at all spatial frequencies. This causes much of the incoming light to be attenuated, resulting in low levels of light for the detector in the correlation plane. To address this issue, Horner and Gianino (J. L. Horner and P. D. Gianino, Appl. Opt. 23, 812-816, 1984) suggested setting the filter magnitude to 1 at all frequencies. Thus the resulting filter contains only phase information and is known as the Phase-Only Filter (POF). The POF has 100% light throughput efficiency.

In optical correlators, matched filters are represented on spatial light modulators (SLMs) which convert electrical inputs to optical properties such as transmittance and reflectance. Examples of SLMs are the magneto-optic SLM (MOSLM) and the liquid crystal display (LCD). The magneto-optic SLM can be operated in two levels of effective transmittance. To accommodate this limitation, Psaltis et al. (D. Psaltis, E. G. Paek, and S. S. Venkatesh, Opt. Eng. 23, 698-704, 1984), and Horner and others (J. L. Horner and H. O. Bartlett, Appl. Opt. 24, 2889-2893, 1985; J. L. Horner and J. R. Leger, Appl. Opt. 24, 609-611, 1985), suggested the use of Binary Phase Only Filters (BPOFs), which use only two levels in the filter. Psaltis et al., supra, suggested binarizing the real part of the matched spatial filter, while Horner and others, supra, suggested binarizing the imaginary part of the matched spatial filter. Later, Cottrell et al. (D. M. Cottrell, R. A. Lilly, J. A. Davis, and T. Day, Appl. Opt. 26, 3755-3761, 1987) proposed binarizing the sum of the real and imaginary parts of the matched spatial filter. The main attribute of the BPOFs is that they are well suited for implementation on a binary SLM such as the magneto-optic SLM; they were not designed specifically for digital implementations.
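The three binarization rules described above can be sketched in a few lines of NumPy. This is only an illustration, not the cited authors' implementations: the function name `binary_phase_only`, its `mode` argument, and the choice to map a value of exactly zero to +1 are assumptions made here.

```python
import numpy as np

def binary_phase_only(H, mode="real"):
    """Binarize a complex filter array H to the two levels +1 and -1.

    mode="real": sign of the real part
    mode="imag": sign of the imaginary part
    mode="sum":  sign of (real part + imaginary part)
    A value of exactly zero is mapped to +1 (arbitrary choice for this sketch).
    """
    H = np.asarray(H, dtype=complex)
    if mode == "real":
        part = H.real
    elif mode == "imag":
        part = H.imag
    elif mode == "sum":
        part = H.real + H.imag
    else:
        raise ValueError("mode must be 'real', 'imag', or 'sum'")
    return np.where(part >= 0, 1.0, -1.0)

# Small made-up filter with one sample in each quadrant of the complex plane.
H = np.array([1 + 2j, -1 + 3j, 2 - 5j, -1 - 1j])
bpof_real = binary_phase_only(H, "real")
bpof_imag = binary_phase_only(H, "imag")
bpof_sum = binary_phase_only(H, "sum")
```

Each resulting array needs only one bit per frequency sample, which is what makes such filters attractive for a two-level device.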

Dickey and Hansche (F. M. Dickey and B. D. Hansche, Appl. Opt. 28, 1611-1613, 1989) extended the BPOF idea to the Quad-Phase Only Filters (QPOFs), which have 4 possible phase levels (namely ±π/4, ±3π/4). These could be implemented using two MOSLMs (each capable of providing only 2 phase levels) to effectively obtain the 4 phases needed in a QPOF. QPOFs are based on a single image and, being all-pass filters, have no ability to suppress noise. An effort to improve the signal-to-noise ratio (SNR) of the QPOF led to the development of the complex ternary matched filter (CTMF) defined below (F. M. Dickey, B. V. K. Vijaya Kumar, L. A. Romero, and J. M. Connely, Opt. Eng. 29, 994-1001, 1990).

$$H_{CTMF}(u,v) = H_{R}(u,v) + jH_{I}(u,v) \tag{1}$$

where H_(R)(u,v), the real part of the filter transfer function H_(CTMF), and H_(I)(u,v), the imaginary part of H_(CTMF), each take on 3 levels (namely −1, 0 and +1) at each frequency (u,v).

However, all the above filters are made from a single reference image and thus are sensitive to any distortions from this reference image. One approach to overcoming the distortion sensitivity of the MSF is to use one MSF for every view. However, this leads to the requirement to store and use a large number of filters, which makes this approach impractical. The alternative is to design composite filters that can exhibit better distortion tolerance than the MSFs. Composite filters (also known as Synthetic Discriminant Function or SDF filters) (C. F. Hester and D. Casasent, Appl. Opt., 19 (1980) 1758; B. V. K. Vijaya Kumar, Appl. Opt., 31 (1992) 4773) use a set of training images to synthesize a template that yields pre-specified correlation outputs in response to training images.

If matched filters are used, many filters would be needed, approximately one filter for each view. When one thinks of possible distortions (e.g., illuminations, expressions, pose changes, etc.), this is clearly too many filters to store and use. Therefore, Hester and Casasent, supra, introduced the concept of SDF filters in 1980. The first SDF filter required that the associated composite template be a weighted sum of training images with the weights chosen so that resulting correlation output values at the origin take on pre-specified (non-zero) values. This filter proved to be unattractive as it almost always led to sidelobes that are much larger than the correlation “peak” (the correlation value at the origin is loosely referred to herein as the correlation peak). Kumar (B. V. K. Vijaya Kumar, JOSA-A, 3 (1986) 1579) introduced the minimum variance SDF (MVSDF) formulation that minimized the output noise variance from the SDF filters. The sidelobe problem was addressed by the minimum average correlation energy (MACE) filters introduced by Mahalanobis et al. (A. Mahalanobis et al., Appl. Opt., 26 (1987) 3633). Refregier (Ph. Refregier, Opt. Lett., 16 (1991) 829) showed how to optimally trade off the noise tolerance and peak sharpness attributes of correlation filters. These and many other SDF filter developments were summarized in the tutorial review paper by Kumar (B. V. K. Vijaya Kumar, Appl. Opt., 31 (1992) 4773).

Correlation filters offer several advantages including shift-invariance (i.e., no need to center the input face image during testing), closed-form solutions, graceful degradation (i.e., loss of parts of input image results in slow loss of correlation peak) and ability to design built-in tolerance to normal impairments such as expression, illumination and pose changes. Correlation filters have been used widely in the areas of signal detection and automatic target recognition. As noted before, the matched filter is known to be optimal for detecting a known signal or image in the presence of additive white Gaussian noise (AWGN). When the noisy reference image or signal is input to the matched filter, its output is the cross correlation of the noisy input image with the reference image. If the noisy input contains a replica of the reference, the correlation output will have a large value (the “correlation peak”) at a location corresponding to the location of the reference image in the input scene and small values elsewhere. The value of the correlation peak is a measure of the likelihood that the input scene contains the reference image and the location of the peak provides the location of the reference image in the input scene. Thus, matched filters are well suited for both detecting the presence of a reference image in a noisy input scene as well as locating it. However, as noted before, matched filters suffer from the problem that the correlation peak degrades significantly when the target object exhibits appearance changes due to normal factors such as illumination changes, pose variations, facial expressions, etc. Therefore, it is desirable to devise filters that are tolerant to such variability.

There are two main stages in correlation-based pattern recognition. First is the correlation filter design (also called “the enrollment stage”) and the second is the use of the correlation filters (also called “the verification stage”). This correlation-based pattern recognition process is shown schematically in FIG. 1.

In the first stage (the enrollment or training stage), training images are used to design the correlation filter. The training images reflect the expected variability in the final image to be verified. For example, in designing a correlation filter for verifying the face of a person A, the person A's face images with a few expected variations (e.g., pose, expression, illumination) are acquired during the enrollment stage. These images are used to construct a correlation filter according to a carefully chosen performance metric and rigorously derived closed-form expressions. Most advanced correlation filter designs are in the frequency domain. Thus, the training images are used to construct one or a few frequency-domain arrays (loosely called correlation “filters” or “templates”) that are stored in the system. Once the filters are computed, the filter arrays are stored and there is no need to store the training images. The authentication performance of the system depends critically on these stored filters. They must be designed to produce large peaks in response to images (many not seen during training) of the authentic user, small values in response to face images of impostors, and be tolerant to noise in the input images. As some of these goals are conflicting, optimal tradeoffs may be devised.

In the second stage, the input test image (e.g., someone's face image) is presented for verification and/or identification. In verification problems, the user claims his/her identity and the task is to compare the input image to the claimed identity and decide whether they match or not. In the identification problem, the user input is matched against a database of stored images (or equivalently, filters) to see which stored template best matches the input image. In either case, the 2-D (two dimensional) fast Fourier transform (FFT) of the test input is first computed and then multiplied by the stored templates (i.e., filter arrays). Thereafter, an inverse FFT (IFFT) of that product is performed to obtain the correlation output. If the input matches the template/filter, a sharply-peaked correlation output is obtained as in FIG. 2A, and when the two do not match, a correlation output with less sharp peaks as in FIG. 2B is obtained. Thus, one can use sharpness of the correlation peak for verification or identification: sharp correlation peaks relate to the images from the authentic user, whereas no large discernible peaks result from face images from impostors. It is noted here that those skilled in the art would recognize that FFT is an efficient algorithm to compute the discrete Fourier transform (DFT) and, hence, the phrases “discrete Fourier transform” or “DFT” and “fast Fourier transform” or “FFT” are used interchangeably herein.
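The FFT, multiply, inverse-FFT sequence of the verification stage can be sketched as follows. This is a minimal illustration under stated assumptions: the names `correlation_plane` and `filter_freq` are hypothetical, the stored array is taken to be the plain 2-D FFT of a template (so the conjugate is applied at test time), and the correlation is circular because it is computed with FFTs of equal size.

```python
import numpy as np

def correlation_plane(test_image, filter_freq):
    """Correlate a test image with a stored frequency-domain filter array."""
    spectrum = np.fft.fft2(test_image)            # 2-D FFT of the test input
    product = spectrum * np.conj(filter_freq)     # pointwise multiply by the template
    # For real images and a real template the result is real up to rounding.
    return np.real(np.fft.ifft2(product))         # inverse FFT gives the correlation output

# Toy check of shift invariance: the "scene" is the template shifted by (3, 5).
rng = np.random.default_rng(0)
template = rng.standard_normal((16, 16))
filter_freq = np.fft.fft2(template)               # assumed storage convention
scene = np.roll(template, shift=(3, 5), axis=(0, 1))
plane = correlation_plane(scene, filter_freq)
peak_row, peak_col = (int(i) for i in np.unravel_index(np.argmax(plane), plane.shape))
```

In this toy case the peak lands at the shift that was applied to the template, which is the shift-invariance property described earlier: the input does not have to be centered before correlation.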

The following figure of merit, known as the peak-to-sidelobe ratio (PSR), is usually employed to measure the peak sharpness. First, the peak (i.e., the largest value) is located in the correlation output, and a small (e.g., of size 5×5) mask is centered at the peak. The sidelobe region is defined as the annular region between this small mask and a larger (e.g., of size 20×20) square also centered at the peak. The annular region may be rectangular or square or in any other suitable polygonal shape. The mean and standard deviation (“σ”) of the sidelobe region are computed and used to estimate the PSR using Eq. (2). PSR estimation is depicted pictorially in FIG. 3.

$$\mathrm{PSR} = \frac{\mathrm{peak} - \mathrm{mean}}{\sigma} \tag{2}$$

The small mask size and the larger square size are somewhat arbitrary and are usually decided through numerical experiments. The basic goal is to be able to estimate the mean and the standard deviation of the correlation output near the correlation peak, but excluding the area close to the peak. The PSR is unaffected by any uniform illumination change in the input image. Thus, for example, if the input image is multiplied by a constant “k” (e.g., uniform illumination), the resulting correlation output will also be multiplied by the same factor. Thus, the peak, mean and standard deviation are all scaled by “k”, making the PSR invariant to “k.” This can be useful in image problems where brightness variations are present. The PSR also takes into account multiple correlation points in the output plane (not just the peak), and thus it can be considered to lead to a more reliable decision. In order for a test image to be declared to belong to the trained class, the correlation peak should not only be large, but the neighboring correlation values should be small. Thus, the final verification decision is based on examining the outputs of many inner products (correlation region around the peak), rather than just one inner product (the correlation peak value).
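A possible way to estimate the PSR of Eq. (2) is sketched below. The function name `peak_to_sidelobe_ratio` and its parameters are hypothetical; the 5×5 and 20×20 sizes are the example values from the text, and treating the correlation plane as circular (so that the window can wrap around the borders) is an assumption that suits FFT-based correlation.

```python
import numpy as np

def peak_to_sidelobe_ratio(plane, inner=5, outer=20):
    """Estimate PSR = (peak - mean) / sigma over an annular sidelobe region."""
    plane = np.asarray(plane, dtype=float)
    if min(plane.shape) < outer:
        raise ValueError("correlation plane is smaller than the outer window")
    pr, pc = np.unravel_index(np.argmax(plane), plane.shape)
    peak = plane[pr, pc]
    # Circularly shift the plane so that the peak sits at index outer//2,
    # then crop the outer x outer window around it.
    half = outer // 2
    window = np.roll(plane, shift=(half - pr, half - pc), axis=(0, 1))[:outer, :outer]
    # Exclude the small inner mask centered on the peak.
    mask = np.ones((outer, outer), dtype=bool)
    lo = half - inner // 2
    mask[lo:lo + inner, lo:lo + inner] = False
    sidelobe = window[mask]
    # Note: a perfectly flat sidelobe region would give sigma = 0 (not handled here).
    return (peak - sidelobe.mean()) / sidelobe.std()

# Toy plane: unit-variance noise with one strong peak.
rng = np.random.default_rng(0)
toy_plane = rng.standard_normal((32, 32))
toy_plane[7, 9] = 50.0
psr = peak_to_sidelobe_ratio(toy_plane)
psr_scaled = peak_to_sidelobe_ratio(2.5 * toy_plane)   # uniform gain "k" = 2.5
```

The second call illustrates the invariance argument: scaling the whole plane by a constant scales the peak, the mean and the standard deviation alike, so the two PSR values agree up to rounding.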

In the discussion below, 1-D (one dimensional) notation is used for convenience, but all equations can easily be generalized to higher dimensions. Let f₁(n), f₂(n), . . . , f_(N)(n) denote the training images (each with L pixels) from the authentic class, and let F₁(k), F₂(k), . . . , F_(N)(k) denote their Fourier transforms. Let H(k) denote the filter. Then the correlation output c_(i)(n) when the input image is f_(i)(n) is given by Eq. (3), where j = √(−1).

$$c_{i}(n) = \frac{1}{L}\sum_{k=1}^{L} F_{i}^{*}(k)\,H(k)\,\exp\left[+j\,\frac{2\pi (k-1)n}{L}\right] \tag{3}$$

In 2-D, let g(m,n) denote the correlation surface produced by the template h(m,n) in response to the input image f(m,n). Strictly speaking, the entire correlation surface g(m,n) is the output of the filter. However, the point g(0,0) is often referred to as “the correlation output or the correlation peak at the origin”. By maximizing the correlation output at the origin, the real peak may be forced to be even larger. With this interpretation, the correlation peak is given by

$$g(0,0) = \sum_{m}\sum_{n} f(m,n)\,h(m,n) = f^{T}h \tag{4}$$

where superscript T denotes the vector transpose and where f and h are the column vector versions of f(m,n) and h(m,n), respectively. In the discussion hereinbelow, matrices are represented by upper case bold letters and vectors by lower case bold letters.

Composite filters are derived from several training images that are representative views of the object or pattern to be recognized. In principle, such filters can be trained to recognize any object or type of distortion as long as the distortion can be adequately represented by the training images. The objective of a composite filter is to be able to recognize the objects from one class (even non-training images), while being able to reject objects from other classes. The optimization of carefully designed performance criteria offers a methodical approach for achieving this objective.

In the early SDF filter designs, the filter was designed to yield a specific value at the origin of the correlation plane in response to each training image. The hope was that such a controlled value would also be the peak in the correlation plane. It was further theorized that the resulting filter would be able to interpolate between the training images to yield comparable output values in response to other (non-training) images from the same class. A set of linear equations describing the constraints on the correlation peaks can be written as

$$X^{+}h = u \tag{5}$$

where h is the filter vector, superscript “+” (in X⁺) denotes the conjugate transpose, X=[x₁ x₂ . . . x_(N)] is an L×N matrix with the N training image Fourier transform vectors (each with L elements, where L is the number of pixels in the image) as its columns, and u=[u₁ u₂ . . . u_(N)]^(T) is an N×1 column vector containing the desired peak values for the N training images. For training images from the desired class (also known as the true class), the constraint values are usually set to 1 and for images from the reject class (also known as the false class), they are usually set to 0.

However, because the number of training images N is generally much smaller than the dimension L (i.e., the number of frequencies) of the filters, the system of linear equations in Eq. (5) is under-determined. By requiring that h is a linear combination of the training images, one can obtain a unique solution known as the equal correlation peak SDF (ECP-SDF).
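The step from the under-determined system to a unique solution can be written out explicitly. Let a denote an N×1 vector of combination weights (a symbol introduced here only for illustration), and assume that the training vectors are linearly independent so that X⁺X is invertible. Substituting h = Xa into Eq. (5) gives

$$h = Xa \;\Rightarrow\; X^{+}Xa = u \;\Rightarrow\; a = \left(X^{+}X\right)^{-1}u \;\Rightarrow\; h = X\left(X^{+}X\right)^{-1}u$$

Only the N×N matrix X⁺X has to be inverted, which is small because N is much smaller than L.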

The ECP-SDF suffers from the problem of large sidelobes. In practice, it is important to ensure that the correlation peak is sharp and that sidelobes are suppressed. One way to achieve this is to minimize the energy in the correlation plane. The minimum average correlation energy (MACE) filter minimizes the average correlation energy (ACE) defined below in Eq. (6) while satisfying the correlation peak constraints in Eq. (5).

$$E_{ave} = \frac{1}{N}\sum_{i=1}^{N}\sum_{k}\sum_{l}\left|H(k,l)\right|^{2}\left|X_{i}(k,l)\right|^{2} = h^{+}Dh \tag{6}$$

where D is a diagonal matrix containing the average training image power spectrum along its diagonal. This leads to the closed form solution of the MACE filter shown in Eq. (7).

$$h = D^{-1}X\left(X^{+}D^{-1}X\right)^{-1}u \tag{7}$$

In the above equations, input images, frequency domain arrays and correlation outputs are assumed to be of size d×d and “N” is the number of training images. Further, h is a d²×1 column vector containing the 2-D correlation filter H(k,l) lexicographically reordered to 1-D, u is a column vector, and X is a d²×N complex matrix whose ith column contains the 2-D Fourier transform of the ith training image lexicographically reordered into a column vector. As is known in the art, in lexicographical reordering, an image is reordered by scanning it row-by-row and placing all the scanned elements in a vector (e.g., a column vector).
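Eq. (7) maps directly onto a few array operations, since D is diagonal and only an N×N system has to be solved. The sketch below is illustrative: the function name `mace_filter`, the default of all-ones peak constraints, and the small floor on the power spectrum (to avoid division by zero) are assumptions, not part of the formulation above.

```python
import numpy as np

def mace_filter(train_images, u=None):
    """Return the MACE filter H(k,l) of Eq. (7) as a d x d complex array."""
    imgs = np.asarray(train_images, dtype=float)
    n_train = imgs.shape[0]
    # Columns of X: 2-D FFTs of the training images, scanned row by row.
    X = np.fft.fft2(imgs, axes=(-2, -1)).reshape(n_train, -1).T
    # Diagonal of D: average power spectrum of the training images.
    d_diag = np.mean(np.abs(X) ** 2, axis=1)
    d_diag = np.maximum(d_diag, 1e-12)            # assumed guard against zeros
    if u is None:
        u = np.ones(n_train, dtype=complex)       # desired peak value 1 per image
    else:
        u = np.asarray(u, dtype=complex)
    Dinv_X = X / d_diag[:, None]                  # D^-1 X without forming D
    gram = X.conj().T @ Dinv_X                    # X^+ D^-1 X, an N x N matrix
    h = Dinv_X @ np.linalg.solve(gram, u)         # Eq. (7)
    return h.reshape(imgs.shape[1:])

# Toy data: three random 8x8 "training images".
rng = np.random.default_rng(1)
train = rng.standard_normal((3, 8, 8))
H = mace_filter(train)
X = np.fft.fft2(train, axes=(-2, -1)).reshape(3, -1).T
peaks = X.conj().T @ H.ravel()                    # should reproduce u (Eq. (5))
```

The last line checks the constraints of Eq. (5) in the frequency domain. Rescaling D by any positive constant leaves the filter unchanged, so the normalization of the power spectrum is immaterial here.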

MACE filters have been shown to generally produce sharp correlation peaks. They are the first set of filters that attempted to control the entire correlation plane. However, MACE filters suffer from two main drawbacks. First, there is no built-in immunity to noise. Second, the MACE filters are often excessively sensitive to intra-class variations.

The minimum variance synthetic discriminant function (MVSDF) was developed to address the noise tolerance issue. Here, the filter h was designed to minimize the effect of additive noise on the correlation output. Let the noise be of zero mean and let C be the diagonal noise power spectral density (PSD) matrix, i.e., the PSD of the noise is represented along the diagonal of C. Then the output noise variance (ONV) can be shown to be σ²=h⁺Ch. The MVSDF minimizes σ² while satisfying the conditions in Eq. (5). Here, C is a d²×d² diagonal matrix whose diagonal elements C(k,k) represent the noise power spectral density at frequency k. Minimizing ONV (σ²) subject to the usual linear constraints of Eq. (5) leads to the following closed form solution:

$$h = C^{-1}X\left(X^{+}C^{-1}X\right)^{-1}u \tag{8}$$

The ECP-SDF is a special case of the MVSDF: if the noise is white, i.e., if C is equal to I, the identity matrix, then the MVSDF is the same as the ECP-SDF.

The MACE filter yields sharp peaks that are easy to detect while the MVSDF is designed to be more robust to noise. Since both attributes (namely, sharp peaks and noise tolerance) are desirable in practice, it is desirable to formulate a filter that possesses the ability to produce sharp peaks and behaves robustly in the presence of noise. Refregier, supra, showed that one can optimally trade off between these two metrics (i.e., ONV and ACE). The resulting filter, named the Optimal Trade-off SDF (OTSDF), is given as

$$h = T^{-1}X\left(X^{+}T^{-1}X\right)^{-1}u \tag{9}$$

where $T = \alpha D + \sqrt{1-\alpha^{2}}\,C$ and 0 ≤ α ≤ 1. It is noted here that when α=1, the optimal tradeoff filter reduces to the MACE filter given in equation (7), and when α=0, it simplifies to the noise-tolerant filter in equation (8).
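The trade-off in Eq. (9) can be explored numerically with a single function that takes α as a parameter. In this sketch, `otsdf_filter` and `noise_psd` are hypothetical names, the matrix X is assumed to already hold the training-image Fourier transforms as columns, and white noise (C = I) is assumed when no noise PSD is supplied.

```python
import numpy as np

def otsdf_filter(X, u, alpha, noise_psd=None):
    """Eq. (9): h = T^-1 X (X^+ T^-1 X)^-1 u with T = alpha*D + sqrt(1-alpha^2)*C."""
    X = np.asarray(X, dtype=complex)
    u = np.asarray(u, dtype=complex)
    d_diag = np.mean(np.abs(X) ** 2, axis=1)          # diagonal of D
    if noise_psd is None:
        c_diag = np.ones_like(d_diag)                 # assumed white noise, C = I
    else:
        c_diag = np.asarray(noise_psd, dtype=float)   # diagonal of C
    t_diag = alpha * d_diag + np.sqrt(1.0 - alpha ** 2) * c_diag
    Tinv_X = X / t_diag[:, None]
    return Tinv_X @ np.linalg.solve(X.conj().T @ Tinv_X, u)

# Toy frequency-domain training matrix: 64 frequencies, 3 training vectors.
rng = np.random.default_rng(2)
X = rng.standard_normal((64, 3)) + 1j * rng.standard_normal((64, 3))
u = np.ones(3)
h_mace = otsdf_filter(X, u, alpha=1.0)     # reduces to Eq. (7)
h_mid = otsdf_filter(X, u, alpha=0.5)      # intermediate trade-off
h_white = otsdf_filter(X, u, alpha=0.0)    # reduces to Eq. (8) with C = I
ecp_sdf = X @ np.linalg.solve(X.conj().T @ X, u)
```

For every α the peak constraints of Eq. (5) remain satisfied; only the balance between correlation energy and noise variance changes. With α=0 and white noise the result coincides with the ECP-SDF, i.e., a linear combination of the training vectors.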

The ECP-SDF filter and its variants such as the MVSDF filter and the MACE filter assume that the distortion tolerance of a filter could be controlled by explicitly specifying desired correlation peak values for training images. The hard constraints in Eq. (5) may be removed because non-training images always yield different values than those specified for the training images and no formal relation appears to exist between the constraints imposed on the filter output and its ability to tolerate distortions. In fact, it is unclear that even intuitively satisfying choices of constraints (such as the Equal Correlation Peak (ECP) condition) have any significant positive impact on a filter's performance. Finally, relaxing or removing the hard constraints should increase the domain of solutions.

Removing the hard constraints in Eq. (5) led to the introduction of the unconstrained MACE (UMACE) filter (A. Mahalanobis, B. V. K. Vijaya Kumar, S. R. F. Sims and J. F. Epperson, Appl. Opt., 33, 3751-3759, 1994). Instead of constraining the peak value at the origin of the correlation output to take on a specific value, UMACE tries to maximize the peak at the origin while minimizing the average correlation energy resulting from the cross-correlation of the training images. This is done by optimizing the metric J(h) in Eq. (10),

$$J(h) = \frac{h^{+}mm^{+}h}{h^{+}Dh} \tag{10}$$

which leads to the closed form solution in Eq. (11) for the UMACE filter.

$$h = D^{-1}m \tag{11}$$

In equations (10) and (11), D is a diagonal matrix as defined earlier, and m denotes the Fourier transform of the mean training image. It is noted that both MACE and UMACE filters yield sharp correlation peaks because they are designed to minimize the average correlation energy.

Adding noise tolerance to the UMACE filter, as was done to MACE filters, yields the unconstrained optimal trade-off SDF (UOTSDF) given in Eq. (12).

$$h = \left(\alpha D + \sqrt{1-\alpha^{2}}\,C\right)^{-1}m \tag{12}$$

Varying α produces filters with optimal tradeoff between noise tolerance and discrimination. Typically, using α values close to, but not equal to, 1 (e.g., 0.99) improves the robustness of MACE filters.
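Because D and C are diagonal, Eqs. (11) and (12) require no matrix inversion at all, only an element-wise division. The sketch below illustrates this; `uotsdf_filter` and `umace_metric` are hypothetical names, X is assumed to hold the training-image Fourier transforms as columns, and white noise (C = I) is assumed by default.

```python
import numpy as np

def uotsdf_filter(X, alpha=0.99, noise_psd=None):
    """Eq. (12): h = (alpha*D + sqrt(1-alpha^2)*C)^-1 m; alpha=1 gives UMACE, Eq. (11)."""
    X = np.asarray(X, dtype=complex)
    m = X.mean(axis=1)                                # mean training vector
    d_diag = np.mean(np.abs(X) ** 2, axis=1)          # diagonal of D
    if noise_psd is None:
        c_diag = np.ones_like(d_diag)                 # assumed white noise, C = I
    else:
        c_diag = np.asarray(noise_psd, dtype=float)
    return m / (alpha * d_diag + np.sqrt(1.0 - alpha ** 2) * c_diag)

def umace_metric(h, X):
    """Eq. (10): J(h) = |m^+ h|^2 / (h^+ D h)."""
    m = X.mean(axis=1)
    d_diag = np.mean(np.abs(X) ** 2, axis=1)
    return np.abs(np.vdot(m, h)) ** 2 / np.real(np.vdot(h, d_diag * h))

rng = np.random.default_rng(3)
X = rng.standard_normal((64, 4)) + 1j * rng.standard_normal((64, 4))
h_umace = uotsdf_filter(X, alpha=1.0)
h_robust = uotsdf_filter(X, alpha=0.99)
h_other = rng.standard_normal(64) + 1j * rng.standard_normal(64)   # arbitrary filter
j_umace = umace_metric(h_umace, X)
j_other = umace_metric(h_other, X)
```

In this toy run the UMACE solution scores at least as high on J(h) as an arbitrary filter, as expected from Eq. (10) being a ratio that is maximized by h = D⁻¹m.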

Advances in correlation filters include considering the correlation plane as a new pattern generated by the correlation filter in response to an input image. The correlation planes may be considered as linearly transformed versions of the input image, obtained by applying the correlation filter. It can then be argued that if the filter is distortion tolerant, its output will not change much even if the input pattern exhibits some variations. Thus, the emphasis is not only on the correlation peak, but on the entire shape of the correlation surface. Based on the above, a metric of interest is the average variation in images after filtering. If g_(i)(m,n) is the correlation surface produced in response to the ith training image, we can quantify the variation in these correlation outputs by the average similarity measure (ASM) defined in Eq. (13),

$$\mathrm{ASM} = \frac{1}{N}\sum_{i=1}^{N}\sum_{m}\sum_{n}\left[g_{i}(m,n) - \bar{g}(m,n)\right]^{2} \tag{13}$$

where

$$\bar{g}(m,n) = \frac{1}{N}\sum_{j=1}^{N} g_{j}(m,n)$$

is the average of the N training image correlation surfaces. ASM is a measure of distortions or dissimilarity (variations) in the correlation surfaces relative to an average shape. In an ideal situation, all correlation surfaces produced by a distortion invariant filter (in response to a valid input pattern) would be the same, and ASM would be zero. In practice, reducing ASM improves the filter stability.
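As a small worked example of Eq. (13), the following sketch computes the ASM for a stack of real-valued correlation surfaces. The function name `average_similarity_measure` and the toy surfaces are made up for illustration.

```python
import numpy as np

def average_similarity_measure(surfaces):
    """Eq. (13): mean over images of the summed squared deviation from the mean surface."""
    g = np.asarray(surfaces, dtype=float)      # shape (N, rows, cols)
    g_bar = g.mean(axis=0)                     # average correlation surface
    return np.sum((g - g_bar) ** 2) / g.shape[0]

# Two 2x2 surfaces: all zeros and all twos; the mean surface is all ones.
asm_different = average_similarity_measure([np.zeros((2, 2)), 2.0 * np.ones((2, 2))])
# Identical surfaces: the ideal, distortion-invariant case.
asm_identical = average_similarity_measure([np.ones((2, 2)), np.ones((2, 2))])
```

In the first case each surface deviates from the mean by 1 at each of its 4 points, so the sum of squares is 8 and the ASM is 8/2 = 4; in the second case the ASM is zero.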

In addition to being distortion-tolerant, a correlation filter must yield large peak values to facilitate detection. Towards this end, one maximizes the filter's response to the training images on the average. However, no hard constraints are imposed on the filter's response to training images at the origin. Rather, it is desired that the filter should yield a large peak on the average over the entire training set. This condition is met by maximizing the average correlation height (ACH) metric defined in Eq. (14),

$$\mathrm{ACH} = \frac{1}{N}\sum_{i=1}^{N} x_{i}^{+}h = m^{+}h \tag{14}$$

where m is the mean of the training vectors x_(i). It is desirable to reduce the effect of noise by reducing ONV. To make ACH large while reducing ASM and ONV, the filter is designed to maximize the metric in Eq. (15).

$$J(h) = \frac{\left|\mathrm{ACH}\right|^{2}}{\mathrm{ASM} + \mathrm{ONV}} \tag{15}$$

The filter which maximizes this metric is referred to as the maximum average correlation height (MACH) filter (A. Mahalanobis et al., Appl. Opt., 33 (1994) 3751).

The correlation filters previously described are presented as linear systems whose response to patterns of interest is carefully controlled by various optimization techniques. The correlation filters may also be interpreted as methods of applying transformations to the input data. Thus the correlation can be viewed as a linear transformation. Specifically, the filtering process can be mathematically expressed as multiplication by a diagonal matrix in the frequency domain.

The distance of a vector x to a reference m_(k) under a linear transform H is given by

$$d_{k} = \left|Hx - Hm_{k}\right|^{2} = \left(x - m_{k}\right)^{+}H^{+}H\left(x - m_{k}\right) \tag{16}$$

where superscript + denotes a conjugate transpose operation.

The filtering process transforms the input images to new images. For the correlation filter to be useful as a transform, it is required that the images of the different classes become as different as possible after filtering. Then, distances can be computed between the transformed input image and the references of the different classes that have been also transformed in the same manner. The input is assigned to the class to which the distance is the smallest. The emphasis is shifted from using just one point (i.e., the correlation peak) to comparing the entire shape of the correlation plane. These facts along with the simplifying properties of linear systems lead to a realization of a distance classifier in the form of a correlation filter. In the distance classifier correlation filter (DCCF) approach (A. Mahalanobis et al., Appl. Opt., 35 (1996) 3127) the transform matrix H is found that maximally separates the classes while making all the classes as compact as possible.
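The distance of Eq. (16) and the smallest-distance assignment rule can be sketched as follows. The transform is assumed here to be already available and diagonal in the frequency domain, stored as the vector `h_diag`; how the DCCF actually finds H is not shown. The names `transformed_distance` and `nearest_class` and the toy vectors are hypothetical.

```python
import numpy as np

def transformed_distance(x, m_k, h_diag):
    """Eq. (16) for a diagonal transform: d_k = |H x - H m_k|^2."""
    diff = np.asarray(h_diag) * (np.asarray(x) - np.asarray(m_k))
    return float(np.real(np.vdot(diff, diff)))        # vdot conjugates its first argument

def nearest_class(x, class_means, h_diag):
    """Assign x to the class whose transformed reference is closest."""
    distances = [transformed_distance(x, m_k, h_diag) for m_k in class_means]
    return int(np.argmin(distances)), distances

# Toy frequency-domain vectors with three samples each.
h_diag = np.array([1.0, 2.0j, 0.5])
x = np.array([1 + 1j, 2.0, 0.0])
class_means = [np.array([1 + 1j, 2.0, 4.0]),          # differs from x in the last sample
               np.array([0.0, 2.0, 0.0])]             # differs from x in the first sample
label, distances = nearest_class(x, class_means, h_diag)

# Cross-check against the explicit matrix form (x - m)^+ H^+ H (x - m).
H = np.diag(h_diag)
diff0 = x - class_means[0]
explicit_d0 = np.real(diff0.conj() @ H.conj().T @ H @ diff0)
```

Here the first distance is |0.5·(−4)|² = 4 and the second is |1+j|² = 2, so the input is assigned to the second class (index 1).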

While the various filters discussed can provide good results, the physical implementation often requires complex computations and large memories. Thus, the need exists for a correlation filter that is computationally simple and that requires less memory than the existing art without compromising the results produced by the filter.

SUMMARY

In one embodiment, the present disclosure contemplates a method, which comprises obtaining a Fast Fourier Transform (FFT) of an image; obtaining an M-level quantization of one or more frequency samples contained in the FFT of the image, wherein the M-level quantization produces a set of quantized values; and constructing a filter using the set of quantized values.

In another embodiment, the present disclosure contemplates a computer system configured to perform the following: obtain a Fast Fourier Transform (FFT) of an image; obtain an M-level quantization of one or more frequency samples contained in the FFT of the image, wherein the M-level quantization produces a set of quantized values; and construct a filter using the set of quantized values. A data storage medium containing the program code to enable the computer system to construct such a filter is also contemplated.

In a further embodiment, the present disclosure contemplates a method to synthesize a correlation filter. The method comprises: obtaining a plurality of images of a subject; building the correlation filter using a first set of images from the plurality of images, wherein the first set contains at least two of the plurality of images; cross-correlating the built correlation filter with a first image in a second set of images from the plurality of images, wherein the second set contains images not contained in the first set of images and wherein the cross-correlation generates a first PSR (Peak-to-Sidelobe Ratio) value; and including the first image in a training set of images for the correlation filter when the first PSR value is less than a first threshold value, wherein the training set of images contains a subset of images from the plurality of images. The present disclosure further contemplates a program code and a computer system to execute the correlation filter synthesis method. The correlation filter may be a MACE (Minimum Average Correlation Energy) filter.
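The selection logic of this embodiment can be sketched as a short loop. The sketch is deliberately generic: `select_training_images`, `build_filter` and `psr_of` are hypothetical names, the filter builder and the PSR routine are passed in as callables, and rebuilding the filter after each newly added image is an assumption rather than something stated above.

```python
def select_training_images(images, build_filter, psr_of, threshold, initial=2):
    """Grow a training set by adding images that the current filter matches poorly."""
    if initial < 2 or len(images) < initial:
        raise ValueError("need at least two initial images")
    training = list(images[:initial])          # first set: at least two images
    filt = build_filter(training)
    for image in images[initial:]:             # second set: images not in the first set
        if psr_of(filt, image) < threshold:    # low PSR: the filter does not cover this view
            training.append(image)
            filt = build_filter(training)      # assumed: rebuild with the enlarged set
    return training, filt

# Toy stand-ins so the loop can be exercised without image data:
# "images" are numbers, the "filter" is their mean, and the "PSR" drops
# as an image moves away from the filter.
def toy_build(imgs):
    return sum(imgs) / len(imgs)

def toy_psr(filt, img):
    return 10.0 - abs(img - filt)

frames = [0.0, 1.0, 0.5, 9.0, 0.4]
selected, final_filter = select_training_images(frames, toy_build, toy_psr, threshold=5.0)
```

In the toy run only the outlying frame 9.0 falls below the threshold, so it is the only one added to the initial pair. In an actual system the callables would be replaced by a filter design routine (e.g., for a MACE filter) and a PSR estimate of the correlation output.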

The present disclosure is directed generally to a correlation filter having reduced storage and computational complexity and increased recognition performance. The storage and computational complexity of both the filter and test and training images are reduced by quantizing the filter (in the frequency domain) as well as the Fourier transform of the test image. With modern, fast, digital computers, digital correlations can be implemented in real-time, thus allowing for real-time quantization of the Fourier transform of the test image. One possible example (but not limited to) of such quantization is retaining the phase (i.e., for example, setting the magnitude at all frequencies to unity) and quantizing the phase to N levels. This quantization can be different for the test image (N quantization levels) and filter (M quantization levels), although typically M=N may be set (i.e., both the filter and the Fourier transform of the test image are quantized to the same number of levels). The quantization scheme can (if desired) be different for each (or a group) of frequencies for each of the filter, test, and training array in the frequency domain. In many instances, it may also be desirable to synthesize the filters from these reduced representation training images (in the frequency domain); therefore, one can also quantize the Fourier transforms of the training images in the same way, and then synthesize these correlation filters. An online training algorithm implemented on a face verification system is described for synthesizing correlation filters to handle pose/scale variations. A way to perform efficient face localization is also discussed.

BRIEF DESCRIPTION OF THE DRAWINGS

For the present disclosure to be easily understood and readily practiced, the present disclosure will now be described for purposes of illustration and not limitation, in connection with the following figures, wherein:

FIG. 1 is a schematic of correlation-based filter design and pattern recognition;

FIGS. 2A and 2B illustrate correlation outputs for an authentic and impostor, respectively;

FIG. 3 shows the region for estimating the peak-to-sidelobe ratio (PSR);

FIG. 4 is a flow chart illustrating an enrollment process according to the present disclosure;

FIG. 5 is a flow chart illustrating a validation process using the reduced-complexity correlation filter of the present disclosure;

FIG. 6 illustrates sample images from the illumination subset of the Pose, Illumination, Expressions (PIE) database with no background lighting;

FIGS. 7A and 7B show the minimum average correlation energy (MACE) filter correlation output and the quad phase MACE (QP-MACE) filter correlation output using full phase correlation;

FIGS. 8A and 8B show the MACE filter correlation output and the QP-MACE filter correlation output using a 4-level correlator;

FIGS. 9A, 9B show Peak-to-Sidelobe Ratio (PSR) comparison between MACE and Quad-Phase MACE filters using the 4-level correlator on the PIE-NL (no lights) dataset for Person 2;

FIGS. 10A, 10B show the Point Spread Function (PSF) of the full complexity MACE filter and the PSF of the QP-MACE filter synthesized for Person 2 for comparison;

FIG. 11 shows the correlation output resulting from the QPMACE filter (synthesized from QP-FFT training images) in conjunction with the 4-level correlator;

FIG. 12 shows the Point Spread Function of the Quad-Phase MACE filter synthesized from the Quad-Phase training images (in the Fourier domain) of Person 2;

FIGS. 13A, 13B show PSR comparison between MACE and Quad-Phase MACE filters using the 4-level correlator on the PIE-NL (no lights) dataset for Person 2;

FIG. 14 is an exemplary flow diagram of an online training algorithm for synthesizing correlation filters according to one embodiment of the present disclosure;

FIGS. 15A and 15B illustrate the face localization methodology according to one embodiment of the present disclosure; and

FIG. 16 illustrates exemplary hardware upon which the methods of the present disclosure may be practiced.

DETAILED DESCRIPTION

Reference will now be made in detail to some embodiments of the present disclosure, examples of which are illustrated in the accompanying figures. It is to be understood that the figures and descriptions of the present disclosure included herein illustrate and describe elements that are of particular relevance to the present disclosure, while eliminating, for the sake of clarity, other elements found in typical digital filters or correlation filters. It is noted at the outset that the terms “connected”, “connecting,” “electrically connected,” etc., are used interchangeably herein to generally refer to the condition of being electrically connected.

Correlation filters owe their origins to optical processing because the 2-D (two dimensional) Fourier transform operation needed for performing 2-D correlations could be naturally accomplished with properties of light such as diffraction and propagation. However, given the current high speeds at which fast Fourier transforms (FFTs) can be implemented, 2-D correlations can be achieved relatively rapidly in current digital implementations. Therefore, the correlation operations discussed hereinbelow are intended for digital implementations. However, it is evident that the filters discussed hereinbelow may be implemented using optical processing as well.

Reduced-complexity correlation filters that maintain good correlation outputs can be achieved by quantizing the Fourier transform of the test/training images in conjunction with using a quantized correlation filter. This scheme may yield large peak values at the origin in the case of MACE type filters, thus providing excellent discrimination performance. FIG. 4 illustrates the enrollment process. In FIG. 4, at step 10, a training image is acquired. At step 12, a 2D Fast Fourier Transform (FFT) is performed to produce a frequency array. At step 14, each entry in the array is quantized to M levels, as will be explained in greater detail below. A filter can be constructed from the quantized data at step 16. At decision step 18, it is determined if more training images are available. If so, the image is acquired, a 2D FFT is performed, and the results quantized at steps 20, 22 and 24, respectively. The results are used to update the filter at step 26. In this manner, an existing filter can be updated or, in the case of initially building a filter, the filter can be incrementally constructed, thereby reducing the amount of memory needed during the enrollment phase. If more images are available as shown by step 28, the process repeats. Otherwise, the process ends (block 30).

FIG. 5 illustrates the discrimination or test phase. The test image is acquired at step 40. A 2D FFT is performed (step 42) and the results quantized to N levels at step 44. The results from the test image are multiplied by the filter produced by the process of FIG. 4 at step 46, and an inverse 2D FFT is performed at step 48. The output of the correlation process is available at step 50. It is observed here that FIG. 4 shows that the Fourier transform of the training image is quantized to M levels, and the resulting filter is quantized to P levels (albeit, typically, P is set equal to M (P=M)). However, FIG. 5 shows that the test image is quantized to N levels. Moreover, the number of quantization levels can be different for each spatial frequency (i.e., N and M can be a function of spatial frequency “f,” i.e., N(f), M(f)). In the special case where the same number of quantization levels is used for all frequencies, it is noted that very good face recognition results may be achieved. In the following discussion, the results shown (without loss of generality) are based on uniform quantization across all frequencies for both the filter and the Fourier transform of the input images.

Typically, but not limited to, only phase information is retained by setting the magnitude to unity. Then, the phases are quantized to N and M levels for the Fourier transforms of the test image and filter, respectively. One example (but not limited to) is setting N=M so that in the case where the test image matches the reference image, the phase of the Fourier Transform of the test image will cancel the phase of the filter, thus yielding a large peak. It is shown below that for N=M=4, very good face verification results may be obtained, while reducing the complexity of performing this task considerably by having filters, and test images, that only require (but not limited to) 2 bits/frequency for storage.
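The phase-only quantization described above can be sketched as follows. This is a minimal illustration rather than the implementation of the disclosure: the function name `quantize_phase`, the choice of equally spaced bins, and the use of bin centres as representative phases are assumptions. With four levels, the bin centres coincide with the phases π/4, 3π/4, 5π/4 and 7π/4.

```python
import numpy as np

def quantize_phase(spectrum, levels):
    """Discard magnitude and snap phase to `levels` equally spaced values.

    Returns the unit-magnitude quantized spectrum and the integer bin index
    of every frequency (the index is what would actually be stored).
    Equal-width bins represented by their centres are an assumption.
    """
    step = 2.0 * np.pi / levels
    phase = np.mod(np.angle(spectrum), 2.0 * np.pi)      # map to [0, 2*pi)
    # Bin k covers [k*step, (k+1)*step); the final modulo guards rounding.
    index = np.floor(phase / step).astype(int) % levels
    quantized = np.exp(1j * (index + 0.5) * step)        # unit magnitude
    return quantized, index

# Usage: quantize the 2-D FFT of a stand-in image to N = 4 phase levels.
rng = np.random.default_rng(0)
image = rng.random((8, 8))
quantized, index = quantize_phase(np.fft.fft2(image), levels=4)
```

Only `index` needs to be kept per frequency, which for four levels is a 2-bit code.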

One example (but not limited to) of applying this to the MACE filter, to improve the filter quality without increasing the memory requirements significantly, is to use a 4-phase filter (with phases π/4, 3π/4, 5π/4, 7π/4) defined below. To prevent the need for one filter for every view (as in the case of MSF, POF, QPOF), this phase quantization may be applied to composite filters such as the MACE filter. The resulting 4-phase MACE filter may be referred to as the Quad-Phase MACE (QPMACE) filter. The QPMACE filters require only a limited number of bits per pixel (e.g., two bits per pixel in the frequency domain) and can be calculated as follows:

$H_{QP-MACE}(u,v) = \begin{cases} +1, & \mathrm{Re}\{H_{MACE}(u,v)\} \geq 0 \\ -1, & \mathrm{Re}\{H_{MACE}(u,v)\} < 0 \\ +j, & \mathrm{Im}\{H_{MACE}(u,v)\} \geq 0 \\ -j, & \mathrm{Im}\{H_{MACE}(u,v)\} < 0 \end{cases} \qquad (17)$

where the QP-MACE filter simply retains the sign bits from the real and imaginary components of the MACE filter, respectively. That is, the QP-MACE filter may be represented by the following equation for simplicity:

$H_{QP-MACE}(u,v) = \mathrm{Sgn}[\mathrm{Re}\{H_{MACE}(u,v)\}] + j\,\mathrm{Sgn}[\mathrm{Im}\{H_{MACE}(u,v)\}] \qquad (17A)$

where Sgn[x] is “+1” for x greater than or equal to 0, and “−1” for x<0. Thus, it is clear that the QP-MACE filter simply takes on (±1) and (±j) sign bits.
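As a small sketch of equation (17A), the sign-bit rule can be applied element-wise to a filter array. The function name `quad_phase` and the sample filter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def quad_phase(h):
    """Eq. (17A): keep only the sign bits of the real and imaginary parts."""
    re = np.where(np.real(h) >= 0, 1.0, -1.0)   # Sgn[x] = +1 for x >= 0, else -1
    im = np.where(np.imag(h) >= 0, 1.0, -1.0)
    return re + 1j * im

# Hypothetical full-complexity filter values at four frequencies.
h_mace = np.array([0.3 + 0.2j, -1.5 + 0.0j, -0.1 - 2.0j, 4.0 - 0.5j])
h_qp = quad_phase(h_mace)

# Phases of the quantized filter, expressed in units of pi.
phases = np.mod(np.angle(h_qp), 2 * np.pi) / np.pi
```

Each output value is one of ±1±j, so its phase is one of π/4, 3π/4, 5π/4 or 7π/4 and only the two sign bits need to be stored.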

To test the performance of these QPMACE filters for the application of face verification, the Pose, Illumination, Expressions (PIE) database collected at the Robotics Institute at Carnegie Mellon University was used. The subset used was the illumination subset, which contains 65 people with approximately 21 images each, captured under two different conditions; one capture session was captured with the lights on (it is referred to herein as the PIE-L subset), and the other was captured with room lights off (it is referred to herein as the PIE-NL subset). FIG. 6 shows a set of example images from person 2.

It is observed that using the QPMACE filter alone does not provide good correlation outputs similar to those of a full complexity MACE filter. The exemplary correlation outputs from the MACE filter (FIG. 7A) and the QPMACE filter (FIG. 7B) using the full-phase Fourier transform of the test image are obtained when the test input image is one of the training images from which these filters are designed. As expected, the MACE filter yields a sharp correlation peak (peak value=1.00). However, the correlation output from the QPMACE filter has a less visible peak (peak value=0.08), which is undesirable. Further, the resulting PSR (=9.91) for the QPMACE filter is low compared to the PSR of 66.07 for the full complexity MACE filter. Note that even when QPMACE is being used, the Fourier transform of the input test image can take on all possible phase values and thus it may be referred to as the full phase correlation. It is noted here that although the discussion herein uses a PSR value as a figure-of-merit (FOM), the discussion is not constrained to use of just the PSR values. Instead, the methodology according to the present disclosure may be used in conjunction with any other computation that produces a similar suitable figure-of-merit.
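For readers who want to reproduce a PSR-style figure-of-merit, a minimal sketch follows. It assumes the commonly used definition PSR = (peak − sidelobe mean) / sidelobe standard deviation, measured in a window around the peak with a small central mask excluded. The window and mask sizes below are illustrative assumptions and are not taken from this section, and the function name is hypothetical.

```python
import numpy as np

def peak_to_sidelobe_ratio(corr, window=20, mask=5):
    """PSR = (peak - mean(sidelobe)) / std(sidelobe) around the largest peak.

    `window` and `mask` are assumed sizes: a (window+1)-wide square centred on
    the peak, minus a mask-wide square at its centre, forms the sidelobe region.
    The correlation plane should be larger than the window in both dimensions,
    and the result is undefined if the sidelobe region is perfectly flat.
    """
    peak_row, peak_col = np.unravel_index(np.argmax(corr), corr.shape)
    peak = corr[peak_row, peak_col]
    half_w, half_m = window // 2, mask // 2
    rows = np.arange(peak_row - half_w, peak_row + half_w + 1) % corr.shape[0]
    cols = np.arange(peak_col - half_w, peak_col + half_w + 1) % corr.shape[1]
    # Indices wrap around because FFT-based correlation is circular.
    region = corr[np.ix_(rows, cols)]
    keep = np.ones(region.shape, dtype=bool)
    keep[half_w - half_m:half_w + half_m + 1,
         half_w - half_m:half_w + half_m + 1] = False   # exclude the peak area
    sidelobe = region[keep]
    return (peak - sidelobe.mean()) / sidelobe.std()

# Usage on a synthetic correlation plane: unit-variance noise plus one peak.
rng = np.random.default_rng(0)
plane = rng.normal(size=(64, 64))
noise_only_psr = peak_to_sidelobe_ratio(plane)
plane[10, 20] = 50.0
peaked_psr = peak_to_sidelobe_ratio(plane)
```

A plane with a pronounced peak gives a much larger value than a plane containing only noise, which is the property the verification decision relies on.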

In an effort to understand why the full phase correlation using QPMACE filters is less than satisfactory, it may be inferred that to get a sharp correlation peak, the product of the Fourier transform of the test image and the filter frequency response must yield something close to a constant value (except for a linear phase term). This may only happen if the phases of the filter frequency response and input image Fourier transform cancel out to yield a constant magnitude.

In the full-complexity MACE filters, both the filter and the input image Fourier transform can take on all possible values and thus there is no difficulty in phases canceling out (assuming that there is no input noise). On the other hand, in the full phase correlation using the QPMACE filters, the filter may take on only one of four phase values (namely π/4, 3π/4, 5π/4, 7π/4), whereas the input image Fourier transform takes on all possible values. The product of these two values may not be a phase-free term and thus the resulting correlation peak may not be as sharply peaked as desired.

The foregoing suggests that it may be beneficial to quantize the phase of the input image Fourier transform to four phase values just as in QPMACE design. In such a case, the filter phase and the input Fourier transform phase are more likely to cancel each other, leading to a sharp correlation peak. This process may be referred to as a 4-level correlator (essentially quantizing the Fourier transform of the test input image to 2 bits/pixel, used in conjunction with a QPMACE filter). The correlation outputs resulting from the 4-level correlator are shown in FIGS. 8A and 8B, where FIG. 8A shows an exemplary correlation output from the MACE filter and FIG. 8B shows an exemplary correlation output from the QPMACE filter using a 4-level correlator. It is seen that the outputs in FIGS. 8A and 7A are identical; however, the peak is significantly sharper in the output in FIG. 8B as compared to the correlation output in FIG. 7B, where the QP-MACE filter was used with full phase correlation.
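The phase-cancellation argument can be illustrated with a toy sketch. Here the quad-phase spectrum of a single image stands in for a QPMACE filter, which is an assumption made purely to show the effect; a real QPMACE filter is synthesized from several training images. The filter is applied through its complex conjugate, as is conventional for correlation, and the helper names are hypothetical.

```python
import numpy as np

def quad_phase(spectrum):
    """Keep only the sign bits of the real and imaginary parts (+/-1 +/-j)."""
    re = np.where(spectrum.real >= 0, 1.0, -1.0)
    im = np.where(spectrum.imag >= 0, 1.0, -1.0)
    return re + 1j * im

def correlate(test_spectrum, filter_freq):
    """Multiply in the frequency domain and return the real correlation plane."""
    return np.real(np.fft.ifft2(test_spectrum * np.conj(filter_freq)))

rng = np.random.default_rng(0)
image = rng.random((32, 32))
spectrum = np.fft.fft2(image)
qp_filter = quad_phase(spectrum)       # stand-in for a QPMACE filter (assumption)

# Full phase correlation: the test spectrum keeps all of its phase values.
full_phase = correlate(spectrum, qp_filter)

# 4-level correlator: the test spectrum is quantized the same way as the filter.
four_level = correlate(quad_phase(spectrum), qp_filter)
```

In the 4-level case every frequency product equals |±1±j|² = 2, so the output is a single impulse at the origin; in the full-phase case the residual phase and magnitude differences leave energy in the sidelobes.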

It is noted here that the terms “2 bits/pixel” and “2 bits/frequency” are used synonymously in the discussion herein. Both of these terms mean the same in the sense that an N×N image has an FFT of size N×N (i.e., the number of pixels is the same as the number of frequencies). Also, it is known in the art that sometimes each frequency is just called a “pixel” with the implicit understanding that a frequency domain array is being referred to.
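To make the storage figure concrete, the following sketch packs a quad-phase array into two bits per frequency. The bit layout (real sign bit followed by imaginary sign bit, with 1 encoding a negative sign) and the function names are assumptions for illustration only. For a 100×100 array this gives 20,000 bits, i.e., 2,500 bytes.

```python
import numpy as np

def pack_quad_phase(q):
    """Pack a (+/-1 +/-j) array into bytes using two sign bits per frequency."""
    re_bit = (np.real(q) < 0).astype(np.uint8)    # 1 encodes a negative real part
    im_bit = (np.imag(q) < 0).astype(np.uint8)    # 1 encodes a negative imaginary part
    bits = np.stack([re_bit.ravel(), im_bit.ravel()], axis=1).ravel()
    return np.packbits(bits)                      # 8 bits per byte, zero padded

def unpack_quad_phase(packed, shape):
    """Recover the (+/-1 +/-j) array of the given shape from its packed bytes."""
    count = int(np.prod(shape))
    bits = np.unpackbits(packed)[:2 * count].reshape(count, 2).astype(float)
    values = (1.0 - 2.0 * bits[:, 0]) + 1j * (1.0 - 2.0 * bits[:, 1])
    return values.reshape(shape)

# Usage: a hypothetical 100x100 quad-phase filter.
rng = np.random.default_rng(0)
q = rng.choice([-1.0, 1.0], size=(100, 100)) + 1j * rng.choice([-1.0, 1.0], size=(100, 100))
packed = pack_quad_phase(q)
restored = unpack_quad_phase(packed, q.shape)
```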

In an experiment, a single MACE filter was synthesized using 3 training images (depicting extreme light variation) for each person. Images no. 3, 7, 16 were used from each person to synthesize their MACE filter. This was repeated to synthesize QP-MACE filters. Each filter was then cross-correlated with the whole database to examine the verification performance. As noted before, the PIE database contains images of 65 people each with 21 images captured under varying illumination conditions. The face images were extracted and normalized for scale using selected ground truth feature points provided with the database. The resulting face images used in the experiments were of size 100×100 pixels. Thus, in this experiment, the same image numbers were selected for every one of the 65 people, and a single MACE filter was synthesized for each person from those images using equation (7) and similarly a reduced memory QPMACE filter was also synthesized using equation (17). For each person's filter, a cross-correlation was performed with the whole dataset (65×21 = 1365 images), to examine the resulting PSRs for images from that person and all the other impostor faces. This process was repeated for all people (a total of 65×1365 = 88,725 cross-correlations), for each of the two illumination datasets (with and without background lighting).

FIG. 9A and FIG. 9B show the Peak-to-Sidelobe Ratio (PSR) plots for person 2 from the PIE-L (FIG. 9A) and the PIE-NL (FIG. 9B) databases, respectively, when using the full complexity MACE filter as well as the 4-level correlator using QPMACE filters. It is seen from these figures that while there is some loss of PSR performance when using the QPMACE filter, it is small. In fact, the 4-level correlator employing QPMACE filters still achieved 100% verification on both PIE-NL and PIE-L databases (for all 65 people), achieving the same verification performance as the full complexity MACE filter. This can be seen from the fact that the PSR curves for the authentics (the top two plots in FIGS. 9A and 9B) and the impostors (the bottom two plots in FIGS. 9A and 9B depicting the maximum impostor PSR among all impostors) can be completely separated by a single threshold.

The three distinct peaks shown in FIG. 9B are those that belong to the 3 training images (#3, #7, #16) used to synthesize person 2's filter. It is observed that while there is a degradation in PSR performance using QP-MACE filters, this degradation is non-linear, i.e., the PSR degrades more for very large PSR values (but still provides a large margin of separation from the impostor PSRs) of the full complexity MACE filters, but this may not be the case for low PSR values resulting from the original full complexity MACE filters. It is observed from FIGS. 9A and 9B that for the impostor PSRs, which are in the range of 10 and below, the QPMACE filter achieves performance very similar to that of the full-phase MACE filters.

Another observation that was consistent throughout all 65 people is that the impostor PSRs are consistently below some threshold (e.g., 12 PSR). This observed upper bound was irrespective of illumination or facial expression change. This property may make MACE type correlation filters ideal for verification because a fixed global threshold may be selected (above which the user is authorized) which is irrespective of what type of distortion occurs, and even irrespective of the person to be authorized. In contrast, however, this property does not hold in other approaches such as traditional Eigenface or IPCA methods, whose residue or distance to face space is highly dependent on illumination changes.

Some benefits of frequency domain quantization can be seen by examining the point spread functions (PSF) of the full-phase MACE filter and the QPMACE filter in FIG. 10A and FIG. 10B, respectively. The PSF is the inverse Fourier transform of the filter. Clearly, the visible face features are preserved in the QPMACE filter. In fact, the QPMACE point spread function looks more like the original face (except for the positive/negative template values). This is because the QPMACE filter is an all-pass filter, allowing equal contributions from the low frequency (more salient features of the face) and high frequency components. In contrast, full-complexity MACE filters emphasize higher frequency components to obtain a near flat spectrum and in the process emphasize higher frequency features (therefore, only edge outlines of mouth, nose, eyes, and eyebrows are visible). A majority of illumination variations affect the lower spatial frequency content of images, and these frequencies are attenuated by the MACE filters; hence, the output is unaffected. Shadows, for example, will introduce new features that have higher spatial frequency content; however, MACE filters look at the whole image and do not focus on any one single feature, thus these types of filters provide a graceful degradation in performance as more distortions occur.

It is noted that QPMACE filters may be produced not only by quantizing the final full phase MACE filters to the exemplary 4 phase levels {π/4, 3π/4, 5π/4, 7π/4}, but also by synthesizing the quad phase MACE filter using quad phase only versions of the Fourier transforms of training images. This has important implications in memory-limited platforms that operate with limited precision, where this might be a computationally attractive solution to storing the training images in the frequency domain for synthesizing these filters. In this specific example of using MACE filters, the user typically (but not necessarily) may store all the training images before synthesizing the MACE filter. Because phase information captures most of the image information, the user may synthesize QPMACE filters from quad-phase only Fourier transforms (QP-FFT) of training images. When a 4-level correlator is used, it may be reasonable to conclude that using the same type of quantized Fourier transforms of input images during testing might in fact improve performance. In general, the quantized MACE filter would be synthesized from quantized Fourier transform training images. FIG. 11 shows the sharp correlation peaks (peak value=1.08, PSR=52.91) from the QPMACE filter designed from the quad phase only Fourier transforms of the training images. The point spread function of the QPMACE filter synthesized from the QP-FFT training images is shown in FIG. 12, which illustrates that this filter is able to retain the main facial structure. In the PSR plots in FIG. 13A and FIG. 13B, the dotted curves represent the QPMACE filters synthesized from QP-FFT training images and yield higher PSRs for some of the images than the QPMACE filters.

In the example above, the reduced complexity scheme was applied to the MACE filter. However, the reduced complexity scheme may be applied to any of the other correlation filter designs. In particular, from a computational viewpoint, the unconstrained MACE (UMACE) filter may be considered to achieve verification performance similar to the MACE filter. Further, UMACE filters are significantly easier to construct than the MACE filter as the cost for building UMACE filters increases linearly with the number of images; i.e., as more images are added to the training set, there are more vectors that have to be added to compute the mean and average power spectrum. Clearly, one advantage of using UMACE filters is that one can build these filters incrementally in an easy fashion. That is, given a new training image, one only needs to update the mean image vector m and the average power spectrum matrix D.

From equation (11), it is seen that since D is a diagonal matrix, the elements of the mean image in the frequency domain are divided by the elements of D along its diagonal. Therefore, it is not necessary to divide by the number of training images to form the mean image and the average power spectrum, as the scalar divisor 1/N cancels out as follows. Thus, the UMACE filter formulation can be simplified even further.

$h_{UMACE} = D^{-1} m = \left[ \frac{1}{N} \sum_{i=1}^{N} \left( X_i' X_i'^{*} \right) \right]^{-1} \frac{1}{N} \sum_{i=1}^{N} x_i = D'^{-1} m' \qquad (18)$

where X_(i)′ is a diagonal matrix containing the Fourier transform of the ith training image lexicographically re-ordered and placed along the diagonal. Similarly, x_(i) contains the same Fourier transform re-ordered into a 1-D column vector.

In general, the mean image vector m and the average power spectrum matrix D would be in the following form.

$D'_{n+1} = \frac{n D'_n + X'_{n+1} X'^{*}_{n+1}}{n+1} \qquad (19)$

$m'_{n+1} = \frac{n m'_n + x_{n+1}}{n+1} \qquad (20)$

Thus, given a new image X_(n+1), and because the common scalar divisor cancels in equation (18), one can incrementally update using the following simplified recurrent equations for m and D for the UMACE filter.

$D'_{n+1} = D'_n + X'_{n+1} X'^{*}_{n+1} \qquad (21)$

$m'_{n+1} = m'_n + x_{n+1} \qquad (22)$

where X_(n+1)′ is a diagonal matrix containing the Fourier transform of the training image at time step (n+1), lexicographically re-ordered and placed along the diagonal. Therefore, for incrementally synthesizing a Quad Phase UMACE filter, one can simplify the update process further: one only needs to store and update the mean image, or simply m_(n+1)′, because the power spectrum is positive and will not affect the sign of the elements in the UMACE filter in equation (18). Therefore, the same QP-UMACE filters are formed. This would be advantageous for use with the quantized 2 bit/frequency FFT training images: because the dynamic range is limited by the number of training images used to synthesize the filter, one can avoid having to perform divisions by N, the number of training images.
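The un-normalized recurrences of equations (21) and (22) can be sketched as a small accumulator. The class name `IncrementalUmace` is hypothetical, and the diagonal matrix D is stored as an array holding only its diagonal entries, which is an implementation assumption.

```python
import numpy as np

class IncrementalUmace:
    """Running sums of eqs. (21)-(22); D' is stored as its diagonal only."""

    def __init__(self, shape):
        self.power_sum = np.zeros(shape)                 # diagonal of D'
        self.mean_sum = np.zeros(shape, dtype=complex)   # m'

    def add(self, image):
        """Fold one new training image into the running sums."""
        spectrum = np.fft.fft2(image)
        self.power_sum += np.abs(spectrum) ** 2          # X' X'^* on the diagonal
        self.mean_sum += spectrum                        # x_{n+1}

    def filter(self):
        """h = D'^{-1} m'; no division by the number of images is needed."""
        return self.mean_sum / self.power_sum

# Usage: add four stand-in training images one at a time.
rng = np.random.default_rng(3)
images = rng.random((4, 8, 8))
accumulator = IncrementalUmace((8, 8))
for img in images:
    accumulator.add(img)
h_incremental = accumulator.filter()
```

Because the same scalar divides both the mean and the power spectrum, the result matches a filter built from the normalized averages, while only two arrays are kept in memory during enrollment.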

It is noted that one can also form QPUMACE filters in the same fashion as before.

$H_{QPUMACE}(u,v) = \begin{cases} +1, & \mathrm{Re}(H_{UMACE}) \geq 0 \\ -1, & \mathrm{Re}(H_{UMACE}) < 0 \\ +j, & \mathrm{Im}(H_{UMACE}) \geq 0 \\ -j, & \mathrm{Im}(H_{UMACE}) < 0 \end{cases} \qquad (23)$

That is, similar to equation (17A), the QPUMACE filter may be represented by the following equation for simplicity:

$H_{QPUMACE}(u,v) = \mathrm{Sgn}[\mathrm{Re}\{H_{UMACE}(u,v)\}] + j\,\mathrm{Sgn}[\mathrm{Im}\{H_{UMACE}(u,v)\}] \qquad (23A)$

where Sgn[x] is “+1” for x greater than or equal to 0, and “−1” for x<0. One can simplify the computation of synthesizing reduced complexity UMACE filters in the special case of using 4 phase levels. Examining the UMACE filter in Eq. (11), it is seen that dividing the mean m by the diagonal elements of the average power spectrum matrix D (which contains only positive diagonal elements) cannot change the signs of the real part and the imaginary part of m (which is what is required to compute the QPUMACE as shown in Eq. (23A)). Thus, matrix D may not affect the resulting QPUMACE filter and may not have to be updated or computed. Therefore, for QPUMACE filters only, the filter synthesis simplifies to the following in Eq. (24). Thus, one only needs to store and update the mean vector m.

$H_{QPUMACE}(u,v) = \begin{cases} +1, & \mathrm{Re}(m) \geq 0 \\ -1, & \mathrm{Re}(m) < 0 \\ +j, & \mathrm{Im}(m) \geq 0 \\ -j, & \mathrm{Im}(m) < 0 \end{cases} \qquad (24)$

The above equation (Eq. (24)) may be represented by the following equation for simplicity:

$H_{QPUMACE}(u,v) = \mathrm{Sgn}[\mathrm{Re}\{m(u,v)\}] + j\,\mathrm{Sgn}[\mathrm{Im}\{m(u,v)\}] \qquad (24A)$

where, as before, Sgn[x] is “+1” for x greater than or equal to 0, and “−1” for x<0. This shows that for the special case of using 4-phase levels, the filter can be formulated directly by looking at the sign bits of the average Fourier transform of the training images.
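The claim that the power spectrum cannot change the sign bits can be checked numerically. The sketch below uses randomly generated stand-in training images and the hypothetical helper `quad_phase`; it compares the quad-phase filter obtained from the full UMACE expression with the one obtained from the summed Fourier transforms alone.

```python
import numpy as np

def quad_phase(h):
    """Sign bits of the real and imaginary parts, as in eqs. (23A) and (24A)."""
    re = np.where(np.real(h) >= 0, 1.0, -1.0)
    im = np.where(np.imag(h) >= 0, 1.0, -1.0)
    return re + 1j * im

rng = np.random.default_rng(1)
training_images = rng.random((5, 16, 16))        # stand-in training images
spectra = np.fft.fft2(training_images)           # 2-D FFT over the last two axes

mean_sum = spectra.sum(axis=0)                   # un-normalized mean m
power_sum = (np.abs(spectra) ** 2).sum(axis=0)   # positive diagonal of D

qp_from_umace = quad_phase(mean_sum / power_sum) # eq. (23A): quantize D^{-1} m
qp_from_mean = quad_phase(mean_sum)              # eq. (24A): quantize m directly
```

The two arrays agree at every frequency, so in this 4-phase case only the running sum of the training spectra has to be stored.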

Similarly, a more efficient system using the QPUMACE filter can also be synthesized from 2 bits/frequency training images, thus saving on workspace memory.

In this very special case of using 4 phase levels, it can be generalized that all unconstrained correlation filters (e.g., UMACE, UOTSDF, MACH) result in the same reduced complexity filter as given in Eq. (24), i.e., that the reduced complexity filter only needs the sign bits of the average Fourier transform of the training images. This is of great significance as it eliminates the need to decide which filter type to use (among the UMACE, UOTSDF, MACH) while providing a significant reduction in design complexity and still retaining very good recognition performance when used in conjunction with the 4-level correlator.

It is observed that there are many application scenarios where training and recognition are to be performed with limited computational resources. The following discussion concentrates on how to synthesize constrained MACE filters efficiently by using computationally efficient methods of computing the inverse (X⁺D⁻¹X)⁻¹ needed in the MACE formulation. Examining the MACE filter equation (7), it can be rewritten in the following form.

$\begin{aligned} h &= D^{-0.5} D^{-0.5} X \left( X^{+} D^{-0.5} D^{-0.5} X \right)^{-1} u \\ &= D^{-0.5} \left( D^{-0.5} X \right) \left( \left( D^{-0.5} X \right)^{+} \left( D^{-0.5} X \right) \right)^{-1} u \\ &= D^{-0.5} X' \left( X'^{+} X' \right)^{-1} u \end{aligned} \qquad (25)$

where

$X' = D^{-0.5} X \qquad (26)$

It is seen from the above that X′ is nothing but the original Fourier transformed training images X pre-whitened by the average power spectrum D.
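The equivalence of the two forms in equation (25) can be verified with a short sketch. Random complex columns stand in for the vectorized training spectra, D is stored as its diagonal, and the function names are hypothetical; the direct form follows the first line of equation (25).

```python
import numpy as np

def mace_direct(X, D, u):
    """h = D^{-1} X (X^+ D^{-1} X)^{-1} u, with D given as its diagonal."""
    Dinv_X = X / D[:, None]
    return Dinv_X @ np.linalg.solve(X.conj().T @ Dinv_X, u)

def mace_prewhitened(X, D, u):
    """h = D^{-0.5} X' (X'^+ X')^{-1} u with X' = D^{-0.5} X, eqs. (25)-(26)."""
    root = np.sqrt(D)
    Xw = X / root[:, None]                        # pre-whitened training spectra
    return (Xw @ np.linalg.solve(Xw.conj().T @ Xw, u)) / root

rng = np.random.default_rng(2)
d, n = 64, 3                                      # frequencies, training images
X = rng.normal(size=(d, n)) + 1j * rng.normal(size=(d, n))
D = np.mean(np.abs(X) ** 2, axis=1)               # average power spectrum (diagonal)
u = np.ones(n)                                    # constrained peak values

h_direct = mace_direct(X, D, u)
h_whitened = mace_prewhitened(X, D, u)
```

Both routes give the same filter, and in either case the filter satisfies the peak constraints X⁺h = u.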

Writing in the format given in equation (25) allows one to form an alternative way of incrementally computing the inverse of the inner-product matrix (also commonly referred to as the Gram matrix) (X′⁺X′) as given below.

$\left( X_t'^{T} X_t' \right)^{-1} = \begin{bmatrix} \left( X_{t-1}'^{T} X_{t-1}' \right)^{-1} & 0 \\ 0^{T} & 0 \end{bmatrix} + k_t^{-1} \begin{bmatrix} -\left( X_{t-1}'^{T} X_{t-1}' \right)^{-1} X_{t-1}'^{T} x_t \\ 1 \end{bmatrix} \begin{bmatrix} -\left( X_{t-1}'^{T} x_t \right)^{T} \left( X_{t-1}'^{T} X_{t-1}' \right)^{-1} & 1 \end{bmatrix} \qquad (27)$

where the scalar constant k_(t) is defined as

$k_t = x_t^{T} x_t - \left( X_{t-1}'^{T} x_t \right)^{T} \left( X_{t-1}'^{T} X_{t-1}' \right)^{-1} \left( X_{t-1}'^{T} x_t \right) \qquad (28)$
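A minimal sketch of the incremental inverse follows. It assumes real-valued arrays to match the transposes written in equations (27) and (28) (complex data would use conjugate transposes instead), treats x_t as the already pre-whitened new column, and uses the hypothetical function name `grow_gram_inverse`.

```python
import numpy as np

def grow_gram_inverse(gram_inv_prev, X_prev, x_new):
    """Update (X^T X)^{-1} when the column x_new is appended to X_prev.

    Returns the enlarged inverse and the scalar k_t of eq. (28).
    """
    b = X_prev.T @ x_new                         # X_{t-1}^T x_t
    a_inv_b = gram_inv_prev @ b                  # (X_{t-1}^T X_{t-1})^{-1} X_{t-1}^T x_t
    k = x_new @ x_new - b @ a_inv_b              # eq. (28)
    n = gram_inv_prev.shape[0]
    grown = np.zeros((n + 1, n + 1))
    grown[:n, :n] = gram_inv_prev                # first term of eq. (27)
    v = np.append(-a_inv_b, 1.0)                 # column vector [-A^{-1} b ; 1]
    grown += np.outer(v, v) / k                  # rank-one correction scaled by 1/k_t
    return grown, k

# Usage: start from three columns and append a fourth.
rng = np.random.default_rng(5)
X_prev = rng.normal(size=(50, 3))
x_new = rng.normal(size=50)
gram_inv_prev = np.linalg.inv(X_prev.T @ X_prev)
gram_inv_new, k_t = grow_gram_inverse(gram_inv_prev, X_prev, x_new)
```

The update costs a few matrix-vector products per new image instead of a full inversion, which is the saving the text refers to when the number of training images grows.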

Equations (27) and (28) can be computationally simpler to compute than the direct inverse for a large number of training images. The only other constraint that must be satisfied to compute the matrix inverse is that the training image at time instant “t” must not be a linear combination of any previous training images, as this may result in X′_(t)^(T) X′_(t) being singular. Therefore, it may be desirable to test whether the determinant of X′_(t)^(T) X′_(t) is non-zero with the addition of each new training image. It is noted that the pre-whitening step does not affect the linear independence of the column space; hence, one can equivalently test the determinant of the un-whitened Gram matrix X_(t)^(T) X_(t) if desired.

Computing the determinant using standard techniques is very expensive. Therefore, a more efficient way to test for linear independence may be desirable. The Gram matrix X′_(t)^(T) X′_(t) has a special structure that can be exploited to formulate an efficient iterative method to compute the determinant as new images are collected.

$X_t'^{T} X_t' = \begin{bmatrix} X_{t-1}'^{T} \\ x_t'^{T} \end{bmatrix} \begin{bmatrix} X_{t-1}' & x_t' \end{bmatrix} = \begin{bmatrix} X_{t-1}'^{T} X_{t-1}' & X_{t-1}'^{T} x_t' \\ x_t'^{T} X_{t-1}' & x_t'^{T} x_t' \end{bmatrix} \qquad (29)$

It is noted that the constant term k_(t) in equation (28) is known as the Schur complement of the partitioned Gram matrix X′_(t)^(T) X′_(t) given above in equation (29). Assuming that X′_(t−1) has linearly independent columns, the linear combination vector e_(t) will be the zero vector only if the image x_(t) is a linear combination of the other training images.

$e_t = \begin{bmatrix} X_{t-1}' & x_t \end{bmatrix} a_t \qquad (30)$

The norm squared of the error e_(t) is (e_(t)^(T) e_(t)), which is also known as the Schur complement of X′_(t)^(T) X′_(t) and can be computed as k_(t) in equation (28).
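The linear-independence test described above can be sketched as follows, again assuming real-valued, already pre-whitened columns and a hypothetical function name. The sketch compares the Schur complement with the squared norm of a least-squares residual and shows that it vanishes for a linearly dependent image.

```python
import numpy as np

def schur_complement(X_prev, x_new):
    """k_t of eq. (28): squared norm of the part of x_new outside span(X_prev)."""
    b = X_prev.T @ x_new
    return x_new @ x_new - b @ np.linalg.solve(X_prev.T @ X_prev, b)

rng = np.random.default_rng(4)
X_prev = rng.normal(size=(40, 3))                   # previously accepted images
independent = rng.normal(size=40)                   # a genuinely new image
dependent = X_prev @ np.array([0.5, -2.0, 1.0])     # a combination of old images

k_independent = schur_complement(X_prev, independent)
k_dependent = schur_complement(X_prev, dependent)

# Residual of the least-squares fit of the new image by the old ones,
# i.e. the error vector e_t of eq. (30) with the best coefficients a_t.
coeffs, *_ = np.linalg.lstsq(X_prev, independent, rcond=None)
residual = independent - X_prev @ coeffs
```

In finite precision the dependent case yields a value close to, rather than exactly, zero, so a practical check would compare k_t against a small tolerance; the tolerance itself is an application choice.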

As discussed in Louis L. Scharf, “Statistical Signal Processing-Detection, Estimation, and Time-Series Analysis”, Addison-Wesley Publishing Company (1991), it can be shown that the determinant of the Gram matrix can be iteratively computed as follows:

$\det\left( X_t'^{T} X_t' \right) = k_t \det\left( X_{t-1}'^{T} X_{t-1}' \right) = \prod_{i=1}^{t} k_i \qquad (31)$

Given that det(X′_(t−1)^(T) X′_(t−1)) is non-zero, det(X′_(t)^(T) X′_(t)) is non-zero only if k_(t) is non-zero. Therefore, one may only need to compute and test if the Schur complement of the augmented Gram matrix X′_(t)^(T) X′_(t) is non-zero with the addition of each new training image. This is a computationally more efficient way to test for linear independence given a new training image.

Online Training Scheme for Synthesizing Distortion Tolerant Correlation Filters

The performance of any recognition system may depend heavily on the choice of training images. The following discussion relates to an online-training algorithm for synthesizing correlation filters. FIG. 14 is an exemplary flow diagram of the online training algorithm for synthesizing correlation filters, preferably MACE filters, according to one embodiment of the present disclosure. Although the discussion is provided in the context of enrolling a person in a face authentication system, it is noted that this scheme may be used for any other applications as well. Further, although the discussion focuses on MACE filters, the training image selection method outlined in FIG. 14 may be used with any other correlation filter. The advanced correlation filters described hereinbefore are attractive because they allow one to synthesize a single filter template from multiple training images. A question that can arise is: what is the maximum limit of training images that can be used to synthesize a single filter? It is observed that currently there is no clear way to quantify this because the number of training images depends on the amount of variability present in these training images. As pointed out earlier, MACE filters perform a correlation energy minimization. Therefore, it may be logical to expect that training images with a larger range of variability will synthesize filters that are not able to minimize the average correlation energy in comparison to training sets with smaller variations. Thus, the resulting PSRs are smaller for the filter synthesized from a larger variation training set. This includes the PSRs obtained even from the training images used to synthesize the filter.

Therefore, one possible way to quantify the quality of a MACE filter is to measure the PSRs (or a similar figure-of-merit) resulting from each of the training images that were used in synthesis. Step 64 in FIG. 14 depicts this PSR computation process. If these PSRs (i.e., the minimum values of these PSRs) are smaller than some threshold θ₂ (step 66), then this filter is not capable of performing well, as one expects the PSRs resulting from test images to be lower than those of the training images. If the training image PSRs are low to begin with, then the filter may not be able to provide sufficient discrimination. A new training image may be selected in this case (step 68); otherwise, the image M may remain included in the current training set if the condition at step 66 is not satisfied.
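The quality check of steps 64 and 66 can be sketched as follows. This is a minimal illustration only: it assumes one commonly used PSR form (peak minus sidelobe mean, divided by sidelobe standard deviation, with a small mask around the peak excluded from the sidelobe region). The function names, the default mask and region sizes, and the representation of a correlation plane as a list of rows are all assumptions made for the example and are not taken from the present disclosure.

```python
from statistics import mean, pstdev

def psr(plane, mask_half=2, region_half=10):
    """Peak-to-sidelobe ratio of a 2-D correlation plane (list of rows)."""
    rows, cols = len(plane), len(plane[0])
    # Locate the correlation peak.
    peak_r, peak_c = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: plane[rc[0]][rc[1]],
    )
    peak = plane[peak_r][peak_c]
    # Sidelobe region: a window around the peak, minus the central mask.
    sidelobe = [
        plane[r][c]
        for r in range(max(0, peak_r - region_half), min(rows, peak_r + region_half + 1))
        for c in range(max(0, peak_c - region_half), min(cols, peak_c + region_half + 1))
        if abs(r - peak_r) > mask_half or abs(c - peak_c) > mask_half
    ]
    if not sidelobe:
        raise ValueError("correlation plane is too small for the chosen mask")
    sigma = pstdev(sidelobe)
    if sigma == 0:
        return float("inf") if peak > mean(sidelobe) else 0.0
    return (peak - mean(sidelobe)) / sigma

def filter_passes_quality_check(training_planes, theta2, **psr_kwargs):
    """Steps 64/66: pass only if the minimum training-image PSR reaches theta2."""
    return min(psr(p, **psr_kwargs) for p in training_planes) >= theta2
```

Here `training_planes` would hold one correlation output per training image, obtained by correlating each training image with the filter under test.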

During the enrollment process, one may have collected a video stream of face images, and the assumption one can make is that the difference in images between successive frames (assuming a reasonable capture frame rate) is not great. Thus, one can build a filter from a couple of images (step 52), then obtain the next frame (block 54) and cross-correlate it with the synthesized filter (step 56). If the computed PSR (i.e., the maximum PSR obtained from all the synthesized filters) is smaller than some threshold θ₁ (step 58), then the current face image is not represented by the synthesized filters. Therefore, it may be desirable to include the current image in the current training set (step 60) and re-synthesize or update the current filter (step 62) using the newly-added image in the training set. After each image is added to the training set, the quality of the updated filter may be tested as described before (steps 64, 66, and 68 in FIG. 14). If the filter has reached its maximum capacity, one may start building a new filter, using the same scheme described hereinabove with reference to FIG. 14. It is noted here that the values of thresholds θ₁ and θ₂ may be flexibly selected at run-time or may be predetermined depending on the type of pattern recognition application and the desired verification accuracy.
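The flow described above with reference to FIG. 14 might be sketched as the loop below. This is one possible reading rather than a definitive implementation: `synthesize` and `psr_of` are hypothetical callables supplied by the caller (filter synthesis and PSR scoring are not implemented here). Treating a failed quality check as "filter at capacity" and seeding the next filter from the rejected frame alone are assumptions made for the example.

```python
def online_training(frames, synthesize, psr_of, theta1, theta2, n_seed=2):
    """Sketch of the FIG. 14 flow.

    frames     : sequence of face images from the enrollment video
    synthesize : hypothetical callable, list of images -> filter
    psr_of     : hypothetical callable, (filter, image) -> PSR value
    Returns a list of (filter, training_set) pairs.
    """
    frames = list(frames)
    training = frames[:n_seed]                  # step 52: seed images
    current = synthesize(training)
    finished = []                               # filters that reached capacity
    for frame in frames[n_seed:]:               # block 54: next frame
        # Steps 56/58: maximum PSR over all synthesized filters.
        best = max(psr_of(f, frame) for f, _ in finished + [(current, training)])
        if best >= theta1:
            continue                            # frame is already represented
        candidate_set = training + [frame]      # step 60
        candidate = synthesize(candidate_set)   # step 62
        # Steps 64/66: minimum PSR over the images used in synthesis.
        if min(psr_of(candidate, img) for img in candidate_set) >= theta2:
            current, training = candidate, candidate_set
        else:
            # Assumed handling: discard the update, keep the previous filter,
            # and start a new filter from the rejected frame.
            finished.append((current, training))
            training = [frame]
            current = synthesize(training)
    finished.append((current, training))
    return finished
```

Passing the two operations in as callables keeps the sketch independent of the filter type, in line with the statement that the selection method may be used with any correlation filter.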

While it was shown hereinbefore that MACE filters are tolerant to illumination changes, handling pose changes is a tougher problem. In efforts to produce sharp correlation peaks, these advanced correlation filters emphasize higher spatial frequencies in the images, thereby capturing the relative geometrical structure of the facial features while ignoring the lower spatial frequencies, which are affected the most by illumination conditions. Thus, expected poses of the face images, including scale changes, need to be shown as examples to these filters so that they correctly classify such images during verification. This may be done using an instructive guide in the system, asking the user to exhibit various pose changes as the online-training algorithm is running. The final enrollment process period may then be dependent on how many filters can be stored, and how much variation is exhibited by the user.

Face Localization

In the authentication process, the user is asked to cooperate and place his face in front of the camera. To not constrain the user and for purposes of increasing the speed of the overall verification process, it may be desirable to implement a face localizer which locates the face and centers it for the classification. While correlation filters are shift invariant, it may still be necessary to provide the full face image for reliable verification. For verification purposes, correlation filters need to have a near full face image to perform well and achieve a PSR score above a specific threshold. This may be harder to achieve especially when other distortions, such as pose variations, are present. Therefore, it may be desirable to add a pre-processing step to locate a partial face in view of the camera, to automatically capture and process a full face image. When a face image is captured, it is correlated with the stored set of correlation filters, and the maximum PSR achieved is stored along with the location of the correlation peak. This peak location may tell how much the face image is shifted. Thus, one can use the peak location to select and crop out the correct face region. The captured image resolution is typically much higher than the resolution of the face images used for verification. In one embodiment, a region of the captured scene is cropped out, and the image region is downsampled to the resolution desired for performing face verification. The downsampling process of the selected face region in the scene allows one to smooth out camera noise by a form of pixel averaging (in comparison to directly capturing a very low resolution image). Also, more importantly, one can first locate the position of the face in the smaller resolution image and estimate the correct face region in the high resolution background image, and then shift the crop window and downsample the estimated region containing the face, and then perform verification. This may be more computationally efficient than performing cross-correlation of a face template on a higher resolution background image to first locate the face, then downsample and perform verification. FIGS. 15A and 15B illustrate the face localization methodology according to one embodiment of the present disclosure. FIG. 15A shows an exemplary captured face image with face location marked with a white cross (the white cross denotes the position of correlation peak). On the other hand, FIG. 15B shows the face in FIG. 15A centered based on the peak location. The full face image in FIG. 15B is captured based on estimated face location. Face identification can be performed by simply enrolling many people in the verification system.
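The crop-window shift and block-averaging downsampling described above might look like the following sketch. It rests on stated assumptions: a peak at the centre of the low-resolution correlation plane is taken to mean "no shift" (the actual convention depends on how the correlation output is indexed), the downsampling factor is an integer, and the shifted window is assumed to stay inside the captured frame. The function names and the list-of-rows image representation are illustrative only.

```python
def shifted_crop_origin(peak, low_res_shape, crop_origin, factor):
    """Map a low-resolution peak offset to a new high-resolution crop origin.

    peak          : (row, col) of the correlation peak in the low-res plane
    low_res_shape : (rows, cols) of the low-res plane; its centre means "no shift"
    crop_origin   : (row, col) of the current crop window in the captured frame
    factor        : integer downsampling factor between the two resolutions
    """
    dr = peak[0] - low_res_shape[0] // 2
    dc = peak[1] - low_res_shape[1] // 2
    return crop_origin[0] + dr * factor, crop_origin[1] + dc * factor

def crop_and_downsample(frame, origin, out_shape, factor):
    """Crop a window at `origin` and average each factor x factor pixel block."""
    r0, c0 = origin
    out = []
    for i in range(out_shape[0]):
        row = []
        for j in range(out_shape[1]):
            block = [
                frame[r0 + i * factor + a][c0 + j * factor + b]
                for a in range(factor)
                for b in range(factor)
            ]
            row.append(sum(block) / len(block))  # pixel averaging smooths noise
        out.append(row)
    return out
```

Only the small, low-resolution image is correlated to find the peak; the high-resolution frame is touched once, when the shifted window is cropped and averaged.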

It is noted here that the term “biometric verification” (1:1 matching) refers to matching the live biometric to the stored biometric of a claimed identity, whereas the term “biometric identification” (1:N matching) refers to finding the best match among N stored biometrics for a live biometric. The face recognition process may usually encompass both face verification and face identification. It is noted here that the face localization discussion given above applies to both face verification and face identification (or to any other biometric verification and identification process). For verification, it is checked whether the PSR exceeds a threshold. On the other hand, for identification, N PSRs may be computed for the N filters and the input image may be assigned to the filter which yields the largest PSR.
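The two decision rules can be contrasted in a few lines. The sketch below assumes the PSR values have already been computed; the function names and the identity labels used in the usage note are hypothetical.

```python
def verify(psr_value, threshold):
    """1:1 matching: accept the claimed identity if the PSR exceeds the threshold."""
    return psr_value > threshold

def identify(psr_by_identity):
    """1:N matching: return the enrolled identity whose filter yields the largest PSR."""
    return max(psr_by_identity, key=psr_by_identity.get)
```

For instance, `identify({"alice": 8.2, "bob": 14.7, "carol": 9.9})` selects `"bob"`, while `verify(14.7, 10.0)` accepts the claim under an illustrative threshold of 10.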

FIG. 16 illustrates one type of apparatus, PDA (Personal Digital Assistant) 70, upon which various methods of the present disclosure, e.g., the methods of FIGS. 4 and 5, may be practiced. Other types of hardware, e.g., cell phones, digital cameras, and the like may also be used as platforms for the present invention. In FIG. 16, the PDA 70 has a touch sensitive screen 72. The PDA 70 may also have one or more of the following: a camera or other imaging device 74, electronics for establishing a wireless internet connection 76, a microphone 78, and a speaker 80. The PDA 70 may carry a memory (e.g., a flash memory) (not shown) for storing the software necessary to implement various processes described hereinbefore (e.g., the processes shown in FIGS. 4, 5, and 14). Alternatively, the PDA 70 may access such software, when needed, through wired or wireless internet connection 76. The software may also be stored in a portable data storage medium (e.g., a compact disc, a floppy diskette, a microdrive, a USB (Universal Serial Bus) memory device, or a digital data storage medium) (not shown) for installation and execution in the PDA 70 or other data processing or computing device. The apparatus shown in FIG. 16 is intended to be exemplary only and not limiting.

The foregoing describes a methodology to reduce the complexity (memory requirement of only 2 bits/pixel in frequency domain) of correlation filters for face recognition. Reduced-complexity correlations are achieved by having quantized MACE, UMACE, OTSDF, UOTSDF, MACH, and other filters, in conjunction with a quantized Fourier transform of the input image. This reduces complexity in comparison to the advanced correlation filters using full-phase correlation. However, the verification performance of the reduced complexity filters is comparable to that of full-complexity filters. A special case of using 4-phases to represent both the filter and training/test images in the Fourier domain leads to further reductions in the computational formulations as shown, for example, in the case of unconstrained correlation filters (leading to requiring only the storage and update of the mean Fourier transform of the training images in incremental updating). This also enables the storage and synthesis of filters (e.g., MACE filters) in limited-memory and limited-computational power platforms such as PDAs, cell phones, etc. An online training algorithm implemented on a face verification system is described for synthesizing correlation filters to handle pose/scale variations. A way to perform efficient face localization is also discussed.
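To make the 2 bits/pixel figure concrete, the sketch below shows one possible four-phase quantizer and a packed storage layout. It is an assumption-laden illustration: the quantizer keeps only the signs of the real and imaginary parts of each frequency sample (so each sample maps to one of ±1 ± j), and four 2-bit codes are packed per byte. The function names and the bit layout are chosen for the example and are not taken from the present disclosure.

```python
def quantize_4phase(samples):
    """Map each complex frequency sample to a 2-bit code.

    Bit 1 is set when the real part is negative, bit 0 when the imaginary
    part is negative (an assumed encoding).
    """
    return [(int(s.real < 0) << 1) | int(s.imag < 0) for s in samples]

def pack_codes(codes):
    """Pack 2-bit codes four per byte, lowest bits first."""
    out = bytearray((len(codes) + 3) // 4)
    for i, code in enumerate(codes):
        out[i // 4] |= code << (2 * (i % 4))
    return bytes(out)

def unpack_codes(packed, count):
    """Recover `count` 2-bit codes from the packed bytes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(count)]

def code_to_phase_value(code):
    """Decode a 2-bit code to one of the four values +/-1 +/- 1j."""
    return complex(-1 if code & 0b10 else 1, -1 if code & 0b01 else 1)
```

Under this layout a 64 x 64 frequency-domain filter would occupy 64 * 64 * 2 / 8 = 1024 bytes.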

It is noted here that although the discussion given hereinabove has been with reference to correlation filters, the quantized FFT values according to the present methodology may be used for any other filters. The correlation filters are discussed herein because of their widespread use in pattern recognition applications where filters of interest are those that produce correlation outputs. However, other non-correlation filters and their corresponding applications may also be configured to utilize the quantization methodology according to the present disclosure.

While the disclosure has been described in detail and with reference to specific embodiments thereof, it will be apparent to one skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the embodiments. Thus, it is intended that the present disclosure cover the modifications and variations of this disclosure provided they come within the scope of the appended claims and their equivalents.

1. A method, comprising: obtaining a Fast Fourier Transform (FFT) of a first image; obtaining an M-level quantization of one or more frequency samples contained in said FFT of said first image, wherein said M-level quantization produces a first set of quantized values; and constructing a filter using said first set of quantized values.
2. The method of claim 1, further comprising: obtaining said FFT of a second image; obtaining said M-level quantization of one or more frequency samples contained in said FFT of said second image, thereby producing a second set of quantized values; and updating said filter using said second set of quantized values.
3. The method of claim 1, further comprising: obtaining said FFT of a second image; obtaining an N-level quantization of one or more frequency samples contained in said FFT of said second image, wherein said N-level quantization produces a second set of quantized values; and applying said filter to said second set of quantized values.
4. The method of claim 3, wherein applying said filter includes: multiplying said filter with each quantized value in said second set of quantized values so as to generate a set of multiplied values; and performing an inverse FFT on said set of multiplied values.
5. The method of claim 3, wherein M=N.
6. The method of claim 3, wherein the value of N varies with the frequency of the frequency sample of said second image being quantized.
7. The method of claim 1, wherein the value of M varies with the frequency of the frequency sample being quantized.
8. The method of claim 1, further comprising storing said filter in an electronic memory using two bits per frequency for storage.
9. The method of claim 1, further comprising: applying said filter to said first set of quantized values.
10. The method of claim 9, wherein applying said filter includes: multiplying said filter with each quantized value in said first set of quantized values so as to generate a set of multiplied values; and performing an inverse FFT on said set of multiplied values.
11. The method of claim 1, wherein said filter is one of the following types: a minimum average correlation energy (MACE) filter; an unconstrained MACE (UMACE) filter; an optimal trade-off synthetic discriminant function (OTSDF) filter; an unconstrained OTSDF (UOTSDF) filter; and a maximum average correlation height (MACH) filter.
12. The method of claim 1, wherein obtaining said M-level quantization of one or more frequency samples includes selecting M phase values of said one or more frequency samples to represent M quantization levels in said M-level quantization.
13. A data storage medium containing program code, which, when executed by a processor, causes said processor to perform the following: obtain a Fast Fourier Transform (FFT) of a first image; obtain an M-level quantization of one or more frequency samples contained in said FFT of said first image, wherein said M-level quantization produces a first set of quantized values; and construct a filter using said first set of quantized values.
14. The data storage medium of claim 13, wherein said program code, upon execution, causes said processor to further perform the following: obtain an FFT of a second image; obtain an M-level quantization of one or more frequency samples contained in said FFT of said second image, thereby producing a second set of quantized values; and update said filter using said second set of quantized values.
15. The data storage medium of claim 13, wherein said program code, upon execution, causes said processor to further perform the following: obtain an FFT of a second image; obtain an N-level quantization of one or more frequency samples contained in said FFT of said second image, wherein said N-level quantization produces a second set of quantized values; and apply said filter to said second set of quantized values.
16. The data storage medium of claim 15, wherein said program code, upon execution, causes said processor to apply said filter to said second set of quantized values by performing the following: multiplying said filter with each quantized value in said second set of quantized values so as to generate a set of multiplied values; and performing an inverse FFT on said set of multiplied values.
17. The data storage medium of claim 13, wherein said program code, upon execution, causes said processor to store said filter in an electronic memory using two bits per frequency for storage.
18. The data storage medium of claim 13, wherein said program code, upon execution, causes said processor to obtain said M-level quantization by selecting M phase values of said one or more frequency samples to represent M quantization levels in said M-level quantization.
19. A computer system configured to perform the following: obtain a Fast Fourier Transform (FFT) of a first image; obtain an M-level quantization of one or more frequency samples contained in said FFT of said first image, wherein said M-level quantization produces a first set of quantized values; and construct a filter using said first set of quantized values.
20. The computer system of claim 19 configured to further perform the following: obtain an FFT of a second image; obtain an M-level quantization of one or more frequency samples contained in said FFT of said second image, thereby producing a second set of quantized values; and update said filter using said second set of quantized values.
21. The computer system of claim 19 configured to further perform the following: obtain an FFT of a second image; obtain an N-level quantization of one or more frequency samples contained in said FFT of said second image, wherein said N-level quantization produces a second set of quantized values; multiply said filter with each quantized value in said second set of quantized values so as to generate a set of multiplied values; and perform an inverse FFT on said set of multiplied values.
22. The computer system of claim 19 configured to store said filter in an electronic memory using two bits per frequency for storage.
23. The computer system of claim 19 configured to obtain said M-level quantization by selecting M phase values of said one or more frequency samples to represent M quantization levels in said M-level quantization.
24. A system comprising: means for obtaining a Fast Fourier Transform (FFT) of a first image; means for obtaining an M-level quantization of one or more frequency samples contained in said FFT of said first image, wherein said M-level quantization produces a first set of quantized values; and means for constructing a filter using said first set of quantized values.
25. The system of claim 24, further comprising: means for obtaining an FFT of a second image; means for obtaining an M-level quantization of one or more frequency samples contained in said FFT of said second image, thereby producing a second set of quantized values; and means for updating said filter using said second set of quantized values.
26. The system of claim 24, further comprising: means for obtaining an FFT of a second image; means for obtaining an N-level quantization of one or more frequency samples contained in said FFT of said second image, wherein said N-level quantization produces a second set of quantized values; means for multiplying said filter with each quantized value in said second set of quantized values so as to generate a set of multiplied values; and means for performing an inverse FFT on said set of multiplied values.
27. The system of claim 24, further comprising means for storing said filter in an electronic memory using two bits per frequency for storage.
28. A method to synthesize a correlation filter, said method comprising: obtaining a plurality of images of a subject; building said correlation filter using a first set of images from said plurality of images, wherein said first set contains at least two of said plurality of images; cross-correlating said built correlation filter with a first image in a second set of images from said plurality of images, wherein said second set contains images not contained in said first set of images and wherein said cross-correlation generates a first PSR (Peak-to-Sidelobe Ratio) value; and including said first image in a training set of images for said correlation filter when said first PSR value is less than a first threshold value, wherein said training set of images contains a subset of images from said plurality of images.
29. The method of claim 28, further comprising: building a first updated version of said correlation filter using said first image in said training set; and testing the quality of said first updated version of said correlation filter.
30. The method of claim 29, wherein said testing includes: measuring a second PSR value resulting from the use of said first image in building said first updated version of said correlation filter; and discarding said first updated version of said correlation filter when said second PSR value is less than a second threshold value.
31. The method of claim 30, wherein said first and said second threshold values are predetermined.
32. The method of claim 30, wherein said measuring step includes: cross-correlating said first updated version of said correlation filter with said first image in said training set, thereby generating said second PSR value.
33. The method of claim 30, further comprising: emptying said training set of images when said second PSR value is less than said second threshold value.
34. The method of claim 29, further comprising: cross-correlating said first updated version of said correlation filter with a second image in said second set of images, thereby generating a third PSR value; and including said second image in said training set of images for said correlation filter when said third PSR value is less than said first threshold value.
35. The method of claim 34, further comprising: building a second updated version of said correlation filter using said first and said second images in said training set; cross-correlating said second updated version of said correlation filter with said first and said second images in said training set, thereby generating a fourth PSR value; and discarding said second updated version of said correlation filter when said fourth PSR value is less than said second threshold value.
36. The method of claim 29, wherein said testing includes: discarding said first updated version of said correlation filter and said training set of images when a quantified value representing the quality of said first updated version of said correlation filter is below a second threshold value.
37. The method of claim 28, wherein said correlation filter is a MACE (Minimum Average Correlation Energy) filter.
38. A data storage medium containing program code, which, when executed by a processor, causes said processor to perform the following: build a correlation filter using a first set of images from a plurality of images of a subject, wherein said first set contains at least two of said plurality of images; cross-correlate said built correlation filter with a first image in a second set of images from said plurality of images, wherein said second set contains images not contained in said first set of images and wherein said cross-correlation generates a first figure-of-merit (FOM) value; and include said first image in a training set of images for said correlation filter when said first FOM value is less than a first threshold value, wherein said training set of images contains a subset of images from said plurality of images.
39. The data storage medium of claim 38, wherein said program code, upon execution, causes said processor to further perform the following: build a first updated version of said correlation filter using said first image in said training set; measure a second FOM value resulting from the use of said first image in building said first updated version of said correlation filter; and discard said first updated version of said correlation filter when said second FOM value is less than a second threshold value.
40. The data storage medium of claim 39, wherein said program code, upon execution, causes said processor to further perform the following: empty said training set of images when said second FOM value is less than said second threshold value.
41. The data storage medium of claim 39, wherein said program code, upon execution, causes said processor to further perform the following: cross-correlate said first updated version of said correlation filter with a second image in said second set of images, thereby generating a third FOM value; include said second image in said training set of images for said correlation filter when said third FOM value is less than said first threshold value; build a second updated version of said correlation filter using said first and said second images in said training set; cross-correlate said second updated version of said correlation filter with said first and said second images in said training set, thereby generating a fourth FOM value; and discard said second updated version of said correlation filter when said fourth FOM value is less than said second threshold value.
42. A computer system configured to perform the following: store a plurality of images of a subject; build a correlation filter using a first set of images from said plurality of images, wherein said first set contains at least two of said plurality of images; cross-correlate said built correlation filter with a first image in a second set of images from said plurality of images, wherein said second set contains images not contained in said first set of images and wherein said cross-correlation generates a first PSR (Peak-to-Sidelobe Ratio) value; create a training set of images for said correlation filter and include said first image in said training set when said first PSR value is less than a first threshold value, wherein said training set of images contains a subset of images from said plurality of images.
43. The computer system of claim 42 configured to further perform the following: build a first updated version of said correlation filter using said first image in said training set; cross-correlate said first updated version of said correlation filter with said first image in said training set, thereby generating a second PSR value; and discard said first updated version of said correlation filter when said second PSR value is less than a second threshold value.
44. The computer system of claim 43 configured to further perform the following: cross-correlate said first updated version of said correlation filter with a second image in said second set of images, thereby generating a third PSR value; include said second image in said training set of images for said correlation filter when said third PSR value is less than said first threshold value; update said first updated version of said correlation filter by building a second updated version of said correlation filter using said first and said second images in said training set; cross-correlate said second updated version of said correlation filter with said first and said second images in said training set, thereby generating a fourth PSR value; and discard said second updated version of said correlation filter when said fourth PSR value is less than said second threshold value.
45. The computer system of claim 44 configured to further perform the following: empty said training set of images when said fourth PSR value is less than said second threshold value.
46. The computer system of claim 42 wherein said correlation filter is a MACE (Minimum Average Correlation Energy) filter.
47. A system comprising: means for obtaining a plurality of images of a subject; means for building a MACE (Minimum Average Correlation Energy) filter using a first set of images from said plurality of images, wherein said first set contains at least two of said plurality of images; means for applying said MACE filter to a first image in a second set of images from said plurality of images so as to generate a first figure-of-merit (FOM) value, wherein said second set contains images not contained in said first set of images; means for establishing a training set of images for said MACE filter, wherein said training set of images contains a subset of images from said plurality of images; and means for adding said first image to said training set when said first FOM value is less than a first threshold value.
48. The system of claim 47, further comprising: means for building a first updated version of said MACE filter using said first image in said training set; means for applying said first updated version of said MACE filter to said first image in said training set so as to generate a second FOM value; means for discarding said first updated version of said MACE filter and emptying said training set of images when said second FOM value is less than a second threshold value; means for applying said first updated version of said MACE filter to a second image in said second set of images so as to generate a third FOM value; and means for adding said second image to said training set when said third FOM value is less than said first threshold value.
49. The system of claim 48, further comprising: means for building a second updated version of said MACE filter using said first and said second images in said training set; means for applying said second updated version of said MACE filter to said first and said second images in said training set so as to generate a fourth FOM value; and means for discarding said second updated version of said MACE filter and emptying said training set of images when said fourth FOM value is less than said second threshold value.
50. A method, comprising: performing a Fast Fourier Transform (FFT) of an image; quantizing one or more frequency samples contained in said FFT of said image to produce an M-level quantization of said FFT, wherein said M-level quantization produces a set of quantized values; and constructing a filter using said set of quantized values.